Materials Informatics 101
A programmatic approach to science
Zachary del Rosario (He/Him)
1
A Programmatic Approach
Programmatic (adj.): Done using computer code rather than by hand, especially to support reproducible science
This workshop is about using informatics tools for programmatic materials science
2
Why? and Hello!
What’s at stake?
What opportunities?
Hello!
4
Why: What’s at Stake?
Reproducibility, Credibility, Scientific Progress, etc.
5
REPRODUCIBILITY
CRISIS
6
Crisis in Empirical, Inferential Work
In psychology and medicine, but surely not in the serious physical sciences...
8
Stodden et al. (2014) PNAS
The journal Science!
10
Why it Matters
Openness is crucial to science
Honest reproducibility is nontrivial
11
Why: What Opportunities?
From cat detectors
to serious science
12
Only ~$20!
13
More Seriously...
Big Tech is using algorithms to spread misinformation, mine your attention, and destroy democracy...
Can we do anything useful with the same techniques?
del Rosario “Why STEM Students Need to Learn Design Refusal” (2021) Liberal Education
14
Yes We Can!
Data Extraction -> Automate tedious tasks
Data Mining -> Find and understand patterns
Prediction -> Fill gaps in our knowledge
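A minimal sketch of the "find patterns" and "fill gaps" steps, in Python. The numbers below are made up for illustration (they are not workshop data), and a straight-line fit stands in for whatever model a real study would justify.

```python
import numpy as np

# Made-up illustrative measurements; the value at x = 3 is "missing".
x = np.array([1.0, 2.0, 4.0, 5.0])
y = np.array([3.0, 5.1, 8.9, 11.0])

# Data mining: quantify the pattern with a correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

# Prediction: fit a line and use it to fill the gap at x = 3.
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * 3.0 + intercept

print(f"correlation r = {r:.3f}")
print(f"predicted y at x = 3: {y_pred:.2f}")
```

Because every step is code, anyone with the same data can rerun it and check the result.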
15
Yes, Scientists Are Already Doing This!
HT-DFT +
Data Science
16
What We’ll Cover
There is so much attention on this topic...
18
What We’ll Cover
Data Science is so much more than just Machine Learning!
19
What We’ll Cover
Quick demo!
20
Today’s Exercises
21
A True Workshop
This is a workshop…
… so you’re going to do some hands-on work!
22
Workshop Schedule
Thursday
Tabula +
WebPlotDigitizer
Tidy Data
Python + Jupyter
Fin
Take-Home
Visual Hierarchy
Visualizing Data
Machine Learning
Block 1
Block 2
Block 3
Wrangling Data
Friday
23
Quick Orientation!
24
Quick Orientation
Live page for in-workshop activities
Please open this now
25
Exercise Time!
Let’s get to work on Data Extraction and Management
26
Pause for Survey
27
Software Setup
29
01_python_assignment
Let’s get to work on Intro to Python and Jupyter Notebooks
30
Pause for Survey
31
An Example: We extracted these data...
That’s weird… there are two “blocks”
33
How Should We Handle These?
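One possible answer, sketched in Python with pandas. This assumes the two "blocks" are one long table that was wrapped side by side when it was extracted. The column names and values are hypothetical, and stack_blocks is a helper written for this sketch, not a pandas function.

```python
import pandas as pd

# Hypothetical extracted table: one long table wrapped into two
# side-by-side blocks (pandas-style duplicate column names).
df_raw = pd.DataFrame({
    "sample": ["A", "B", "C"],
    "strength": [10.0, 12.5, 11.0],
    "sample.1": ["D", "E", "F"],
    "strength.1": [9.5, 13.0, 12.0],
})

def stack_blocks(df, columns):
    """Split df into consecutive column blocks and stack them vertically."""
    n = len(columns)
    blocks = []
    for start in range(0, df.shape[1], n):
        block = df.iloc[:, start:start + n].copy()
        block.columns = columns  # give every block the same column names
        blocks.append(block)
    # Drop rows that are entirely empty (padding in a shorter block).
    return pd.concat(blocks).dropna(how="all").reset_index(drop=True)

df_tidy = stack_blocks(df_raw, ["sample", "strength"])
print(df_tidy)
```

The result has one row per observation and one column per variable, which is the tidy layout the next exercise works with.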
35
02_tidy_assignment
Let’s get to work on Intro to Data Wrangling and Tidy Data
36
Pause for Survey
37
Looking Ahead
What’s going on tomorrow?
38
Workshop Schedule
Thursday
Tabula +
WebPlotDigitizer
Tidy Data
Python + Jupyter
Fin
Take-Home
Visual Hierarchy
Visualizing Data
Machine Learning
Block 1
Block 2
Block 3
Wrangling Data
Friday
Totally Optional!
39