1 of 29

Materials Informatics 101

A programmatic approach to science

Zachary del Rosario (He/Him)

2 of 29

A Programmatic Approach

Programmatic (a): Done using computer code, rather than by hand, especially to support reproducible science

This workshop is about using informatics tools for programmatic materials science

3 of 29

Why?

What’s at stake?

What opportunities?

4 of 29

Why? and Hello!

What’s at stake?

What opportunities?

Hello!

  • Zachary del Rosario (he/him)
  • Faculty at Olin College
  • “I help scientists and engineers reason under uncertainty.”

5 of 29

Why: What’s at Stake?

Reproducibility, Credibility, Scientific Progress, etc.

6 of 29

REPRODUCIBILITY

CRISIS

7 of 29

Crisis in Empirical, Inferential Work

8 of 29

Crisis in Empirical, Inferential Work

In psychology and medicine, but surely not in the serious physical sciences…

9 of 29

Stodden et al. (2014) PNAS

10 of 29

Stodden et al. (2014) PNAS

The journal Science!

11 of 29

Why it Matters

Openness is crucial to science

  • E.g. “What killed alchemy?”, A. Gelman

Honest reproducibility is nontrivial

  • Programmatic tools help!

12 of 29

Why: What Opportunities?

From cat detectors

to serious science

13 of 29

Only ~$20!

14 of 29

More Seriously….

Big Tech is using algorithms to spread misinformation, mine your attention, and destroy democracy….

Can we do anything useful with the same techniques?

del Rosario “Why STEM Students Need to Learn Design Refusal” (2021) Liberal Education

15 of 29

Yes We Can!

Data Extraction -> Automate tedious tasks

Data Mining -> Find and understand patterns

Prediction -> Fill gaps in our knowledge
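
A minimal sketch of the three ideas above in Python, the language used later in this workshop. Everything here is an assumption made for illustration: the column names, the numbers in the toy table (made up, not real measurements), and the helper names `load_rows` and `fit_line`. A straight-line fit stands in for "prediction"; real work would call for a more careful model.

```python
import csv
import io

# Hypothetical toy table standing in for data pulled out of a PDF.
# The values are made up for illustration only; one strength value is missing.
RAW = """temperature_C,strength_MPa
20,310
100,295
200,
300,258
"""


def load_rows(text):
    """Data extraction: parse the text table into (x, y) pairs; y is None if missing."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        y = rec["strength_MPa"]
        rows.append((float(rec["temperature_C"]), float(y) if y else None))
    return rows


def fit_line(points):
    """Data mining: ordinary least-squares line through the known points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rows = load_rows(RAW)
known = [(x, y) for x, y in rows if y is not None]
slope, intercept = fit_line(known)

# Prediction: fill the missing entry using the fitted trend.
filled = [(x, y if y is not None else slope * x + intercept) for x, y in rows]
for x, y in filled:
    print(f"{x:6.1f} C -> {y:6.1f} MPa")
```

Because every step is code, rerunning the script reproduces the same filled-in table, which is the point of a programmatic approach.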

16 of 29

Yes, Scientists Are Already Doing This!

HT-DFT +

Data Science

17 of 29

What We’ll Cover

  • Data Extraction
  • Data Management
  • Data Visualization
  • Basics of Machine Learning

18 of 29

What We’ll Cover

  • Data Extraction
  • Data Management
  • Data Visualization
  • Basics of Machine Learning

There is so much attention on this topic...

19 of 29

What We’ll Cover

  • Data Extraction
  • Data Management
  • Data Visualization
  • Basics of Machine Learning

Data Science is so much more than just Machine Learning!

20 of 29

What We’ll Cover

  • Data Extraction
  • Data Management
  • Data Visualization
  • Basics of Machine Learning

Quick demo!

21 of 29

Today’s Exercises

22 of 29

A True Workshop

This is a workshop

… so you’re going to do some hands-on work!

23 of 29

Workshop Schedule

            Friday                     Saturday         Sunday      Monday
Focus       Extract                    Wrangle + Tidy   Visualize   Model
Live        Tabula + WebPlotDigitizer  Concepts         Concepts    Concepts
Take-Home   Python + Jupyter           Execution        Execution   Fin

24 of 29

Exercise and Notebook - In General

  • Live:
    • We’ll do this together, during the 12-2pm block

  • Take-Home:
    • You’ll do this on your own!
    • Office hours available: 4-5pm (same Zoom)

25 of 29

Exercise and Notebook - Today

  • Live: Data Extraction
    • Tabula
    • WebPlotDigitizer

  • Take-Home: Python and Jupyter
    • Working in Jupyter
    • Python basics

26 of 29

Quick Orientation!

  • Open your browser:

bit.ly/gatw2021

27 of 29

Quick Orientation

Live page for in-workshop activities

Please open this now

28 of 29

Exercise and Notebook - Today

  • Live: Data Extraction (~12:30-1:30pm)
    • Tabula
    • WebPlotDigitizer
  • Stand by for breakout rooms…

Use the “Ask for Help” tool if you need help!

29 of 29

Exercise Time!

Let’s get to work on Day 1 (Live)
