Working With Data
Advantages of a programmatic approach
Zachary del Rosario (He/Him)
1
Workshop Schedule
Friday: Extract
Saturday: Wrangle + Tidy
Sunday: Visualize
Monday: Model
Tabula + WebPlotDigitizer
Python + Jupyter
Concepts
Execution
Concepts
Execution
Concepts
Fin
Focus
Live
Take-Home
2
An Example: We extracted these data...
That’s weird… there are two “blocks”
3
How Should We Handle These?
4
Power of a Programmatic Approach
I’m going to give you some ideas on how a programmatic approach to data can help!
6
Programmatic Data Management
Beyond spreadsheets
7
Comparison: Two Workflows
Spreadsheet
Programmatic
8
What Does a Programmatic Approach Look Like?
11
What Does a Programmatic Approach Look Like?
Raw data subdirectory
12
What Does a Programmatic Approach Look Like?
Cache figures, for publications!
13
What Does a Programmatic Approach Look Like?
Processed data subdirectory
Raw data subdirectory
14
What Does a Programmatic Approach Look Like?
Filenames imply execution order
15
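As a small illustration of filenames implying execution order, here is a minimal sketch that sorts notebooks by a numeric prefix. The notebook names are hypothetical and not taken from the workshop materials; the only assumption is a naming scheme such as `01_...`, `02_...`.

```python
from pathlib import PurePath

# Hypothetical notebook names; the numeric prefix encodes the run order
notebooks = [
    "03_model.ipynb",
    "01_process_raw.ipynb",
    "02_visualize.ipynb",
]


def run_order(names):
    """Sort notebook filenames by their numeric prefix."""
    return sorted(names, key=lambda name: int(PurePath(name).stem.split("_")[0]))


ordered = run_order(notebooks)
print(ordered)
```

Zero-padding the prefix (`01`, `02`, ...) also makes a plain alphabetical listing in a file browser match the intended order.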
What Does a Programmatic Approach Look Like?
Common utils keep notebooks clean
16
What Does a Programmatic Approach Look Like?
Load raw data into notebook
17
What Does a Programmatic Approach Look Like?
Load raw data into notebook
Process raw into working data
18
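The load-then-process pattern above can be sketched with only the Python standard library. This is an illustration under assumptions, not the workshop's actual notebook: the directory names (`data-raw`, `data-proc`), the file names, the column names, and the values are all made up for the demo. The point is that the raw file is only ever read, and cleaned data go to a separate working file.

```python
import csv
import tempfile
from pathlib import Path


def process_raw(raw_path, proc_path):
    """Read the raw CSV, clean it in memory, and write a separate working copy."""
    with open(raw_path, newline="") as f:
        rows = list(csv.DictReader(f))

    # Example cleaning step: drop rows with a missing value, convert text to float
    cleaned = [
        {"specimen": r["specimen"], "strength": float(r["strength"])}
        for r in rows
        if r["strength"].strip() != ""
    ]

    # Write the working data next to, never on top of, the raw data
    proc_path.parent.mkdir(parents=True, exist_ok=True)
    with open(proc_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["specimen", "strength"])
        writer.writeheader()
        writer.writerows(cleaned)
    return cleaned


# Demo in a temporary project folder (all names and values are hypothetical)
project = Path(tempfile.mkdtemp())
raw_file = project / "data-raw" / "strength.csv"
raw_file.parent.mkdir(parents=True)
raw_text = "specimen,strength\n1,281.5\n2,\n3,310.2\n"
raw_file.write_text(raw_text)

working = process_raw(raw_file, project / "data-proc" / "strength_clean.csv")
print(working)
```

Because every cleaning decision lives in code, rerunning the script reproduces the working data from the untouched raw file.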
Note Well…
19
Back to the Weibull Data
I wouldn’t edit the spreadsheet!
I would write a processing notebook
(you will!)
20
Tidy Data
21
An Example: We extracted these data...
That’s weird… there are two “blocks”
How do we handle this?
22
23
An Example: We extracted these data...
That’s weird… there are two “blocks”
Rows have two observations each (not one)
24
A Magnificent Function!
Q: How do we fix this “two observation” problem?
A: With pivoting functions
pivot_longer - Tidyverse (R)
tf_pivot_longer - Grama (Python)
In today’s Live Exercise!
25
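To show what a pivot-longer step does conceptually, here is a plain-Python sketch that turns rows holding two observations into one row per observation. It does not use `pivot_longer` or `tf_pivot_longer`, and it is not either library's API; the function name `pivot_blocks_longer`, the column names, and the values are made up for illustration.

```python
# Made-up wide table: each row holds two observations, one per "block"
wide = [
    {"specimen_1": 1, "strength_1": 281.5, "specimen_2": 4, "strength_2": 297.0},
    {"specimen_1": 2, "strength_1": 305.1, "specimen_2": 5, "strength_2": 312.4},
]


def pivot_blocks_longer(rows, stubs, blocks):
    """Return one row per observation, stacking the blocks in order."""
    long_rows = []
    for block in blocks:
        for row in rows:
            # Pull this block's columns and strip the block suffix from the names
            obs = {stub: row[f"{stub}_{block}"] for stub in stubs}
            obs["block"] = block  # remember which block the observation came from
            long_rows.append(obs)
    return long_rows


long = pivot_blocks_longer(wide, stubs=["specimen", "strength"], blocks=[1, 2])
for obs in long:
    print(obs)
```

The two-row, four-column table becomes four rows with `specimen`, `strength`, and `block` columns, which is the one-observation-per-row shape that tidy data calls for.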
Today’s Exercises
26
Exercise and Notebook
27
If Your Install Isn’t Working...
I’ve prepared a Google Colab option:
https://github.com/zdelrosario/mi101-colab
Will paste this in chat
NB. Also linked from MI101 Workshop site
28
Today’s Exercise
29