1 of 26

Getting Started with

Spatial Data Science in Python

University of Minnesota Geocomputing Group 10.24.18

University of Minnesota Day of Data 01.12.18

Bryan C. Runck

Department of Geography, Environment and Society

University of Minnesota

2 of 26

Objectives

  1. Describe why spatial thinking matters
  2. Know how to get spatial data into Jupyter
    • !wget
    • Minnesota Geospatial Commons
  3. Perform exploratory spatial data analysis in Python
    • GeoPandas and Seaborn
    • I/O
    • Choropleth maps
    • Scatterplots
  4. Compute spatial autocorrelation
    • PySAL
    • Moran’s I

3 of 26

Why Spatial?

Describe

4 of 26

What is spatial?

Cholera outbreak

London

6 of 26

What is spatial?

Farm management

Minnesota

8 of 26

Why think spatially?

  1. Knowledge impact
  2. Data requirements
  3. Everyone’s doing it
  4. Good open source tools
  5. Tons of data
  6. In demand skills
  7. Nice people

9 of 26

Spatial Data + Jupyter

Getting Data: Minnesota Geospatial Commons

10 of 26

Link to Jupyter

https://z.umn.edu/spatial-python

Notebook: getting-starting-spatial-data.ipynb

11 of 26

Key functions (Demo)

!wget <insert weblink here>

!ls -l *.shp

!unzip <insert file name here> #if installed
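If unzip is not installed, the same step can be done from Python. The sketch below uses only the standard library; the function name extract_shapefiles and the file names in the usage note are hypothetical, not part of the workshop notebook.

```python
import zipfile
from pathlib import Path


def extract_shapefiles(zip_path, dest_dir):
    """Unpack a downloaded archive and return the .shp files found inside it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # A shapefile travels with sidecar files (.shx, .dbf, .prj);
    # only the .shp entry points are listed here.
    return sorted(dest.rglob("*.shp"))
```

Usage, assuming an archive named data.zip was already downloaded with !wget: extract_shapefiles("data.zip", "data") returns the paths of the .shp files, much as !ls -l *.shp would list them.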

12 of 26

Explore Minnesota Geospatial Commons

Google: Minnesota Geospatial Commons

Find a dataset that is:

  1. Useful
  2. Interesting
  3. Wacky

5 minutes: Report back

  • Types of data, sizes of data, observations

13 of 26

Types of Spatial Data

15 of 26

Exploratory Spatial Data Analysis with GeoPandas

16 of 26

What is GeoPandas?

Extends pandas with

  1. Shapely (geometric objects and operations)
  2. Fiona (reading and writing spatial file formats)
  3. Descartes and matplotlib (plotting)

Makes mapping easy in Python
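A minimal sketch of what "easy" can look like, assuming GeoPandas and matplotlib are installed. The function name plot_choropleth, the file path, and the column name are placeholders; the demo notebook may do this differently.

```python
def plot_choropleth(path, column):
    """Read a vector file and draw a choropleth map of one attribute column."""
    # Imported inside the function so the sketch can be defined
    # even where the mapping libraries are not installed.
    import geopandas as gpd
    import matplotlib.pyplot as plt

    gdf = gpd.read_file(path)  # geometry plus attribute table as a GeoDataFrame
    fig, ax = plt.subplots(figsize=(8, 8))
    # Color each feature by the value of the chosen column.
    gdf.plot(column=column, cmap="viridis", legend=True, ax=ax)
    ax.set_axis_off()
    ax.set_title(column)
    return ax
```

Usage, with a hypothetical file and column: plot_choropleth("counties.shp", "population").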

17 of 26

Demo

18 of 26

Activity

Utilize the basic ideas we explored related to mapping to:

  1. Identify three potential communities where you would want to stay
  2. Make a map with these three communities highlighted
  3. Challenge: create a linear combination of variables to create an index score of where you would want to stay. For example, the value of community to you could be modeled as:
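One way to approach the challenge, as a sketch only: rescale each variable to the range 0 to 1, then add them up with weights that reflect your preferences. The community names, variables, and weights below are invented for illustration and are not from the workshop data.

```python
def min_max(values):
    """Rescale a list of numbers to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def index_scores(rows, weights):
    """Weighted sum of rescaled variables, one score per community."""
    totals = [0.0] * len(rows)
    for var, weight in weights.items():
        scaled = min_max([row[var] for row in rows])
        for k, value in enumerate(scaled):
            totals[k] += weight * value
    return dict(zip([row["name"] for row in rows], totals))


# Hypothetical communities and variables.
communities = [
    {"name": "A", "price": 80, "reviews": 4.8},
    {"name": "B", "price": 120, "reviews": 4.9},
    {"name": "C", "price": 60, "reviews": 4.1},
]
# Negative weight: a higher price lowers the score.
weights = {"price": -0.5, "reviews": 0.5}

scores = index_scores(communities, weights)
best = max(scores, key=scores.get)
```

The resulting scores can be joined back to the map layer as a new column and shown as a choropleth.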

19 of 26

Spatial Autocorrelation

20 of 26

What is spatial autocorrelation?

21 of 26

How can we measure spatial autocorrelation?

  • Global measures
    • Moran’s I
    • Gamma Index
    • Join count statistics (binary data)
    • Geary’s C
    • Getis and Ord’s G
  • Local measures
    • Local Moran’s I
    • Local G and G*

22 of 26

Moran’s I

  1. Similar to a correlation coefficient
  2. Usually -1 <= I <= 1 (positive: similar values cluster; negative: dissimilar values are neighbors)
  3. Depends on weights matrices
  4. Null hypothesis: spatial randomness

N is the number of spatial units, indexed by i and j

x is the variable of interest

x_bar is the mean of x

w_ij is the element of the spatial weights matrix for units i and j

W is the sum of all w_ij
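With these symbols, the standard textbook form of global Moran's I can be written as follows, where \bar{x} stands for x_bar. This is the usual definition, given here as a reference rather than copied from the slide.

```latex
I = \frac{N}{W} \,
    \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} \, (x_i - \bar{x})(x_j - \bar{x})}
         {\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad
W = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}
```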

23 of 26

Weights Matrices: Define Neighbors

A weights matrix defines which units count as neighbors of each unit and how strongly each neighbor is weighted
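To make the idea concrete, the sketch below builds a binary rook-contiguity weights matrix for a small regular grid (cells that share an edge are neighbors) and computes global Moran's I from it in plain Python. The function names and the toy grids are illustrative assumptions; the demo itself uses PySAL, which is not shown here.

```python
def rook_weights(n_rows, n_cols):
    """Binary rook weights for a grid: w[i][j] = 1 if cells i and j share an edge."""
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1.0
    return w


def morans_i(x, w):
    """Global Moran's I for values x and a weights matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]                     # deviations from the mean
    total_w = sum(sum(row) for row in w)          # W, the sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / total_w) * cross / sum(v * v for v in z)


w = rook_weights(4, 4)
# Alternating 0/1 cells: every neighbor is dissimilar.
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]
# Left half 0, right half 1: most neighbors are similar.
halves = [0 if c < 2 else 1 for r in range(4) for c in range(4)]

i_checker = morans_i(checkerboard, w)
i_halves = morans_i(halves, w)
```

The checkerboard gives strongly negative autocorrelation (about -1), while the two-halves pattern gives a clearly positive value (about 0.67). Changing the neighbor definition changes W and therefore I, which is why the statistic depends on the weights matrix.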

24 of 26

Demo

25 of 26

Activity

Utilize the basic ideas we explored related to spatial autocorrelation to:

  1. Test spatial autocorrelation across multiple variables and weights
  2. Which variable is the most spatially autocorrelated?
  3. Do you have any hunches as to why there is or isn’t spatial autocorrelation in different variables?

15 minutes: report back

26 of 26

Thank You

Bryan C. Runck

runck014@umn.edu