1 of 10

Important Takeaways on Using ML/AI in Practice

Rayid Ghani

2 of 10

Takeaways

3 of 10

All Your Modeling Decisions Should Reflect How the AI System Will Be Used

What is your goal (it’s not to build a model)? What are your constraints?

What are you trying to generalize to?

Who/what is this model going to be applied to? When?

What’s the right metric for “accuracy” (this is rarely AUC or F1)?

What does fairness mean in this context?

4 of 10

Remember to Think 4th-Dimensionally (Time Matters)

When is your prediction being made?

What information is available for features at that time? What isn’t?

How frequently will the model be updated?

Over what time horizon is your label/outcome occurring?

Leave time to collect labels/outcomes between training and validation
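
The points above can be sketched as a time-based split. This is a minimal illustration, not a prescribed method: the function name `temporal_split`, the `as_of` field, and the 90-day label horizon are all assumptions made for the example. The idea is that a training row is only usable once its outcome window has closed, so the training data must end at least one label horizon before validation predictions begin.

```python
from datetime import date, timedelta

def temporal_split(rows, train_as_of, label_horizon_days):
    """Split rows by prediction date, leaving a gap for labels to mature.

    rows: list of dicts with an "as_of" date (when the prediction is made).
    train_as_of: the date the model would be trained (hypothetical).
    """
    horizon = timedelta(days=label_horizon_days)
    # A training row's outcome window must have closed by train_as_of.
    train_cutoff = train_as_of - horizon
    train = [r for r in rows if r["as_of"] <= train_cutoff]
    # Validation predictions are made on or after the training date.
    validation = [r for r in rows if r["as_of"] >= train_as_of]
    return train, validation

# Hypothetical data: one prediction date every 30 days, starting 2023-01-01.
rows = [{"id": i, "as_of": date(2023, 1, 1) + timedelta(days=30 * i)}
        for i in range(12)]
train, validation = temporal_split(rows, date(2023, 7, 1), label_horizon_days=90)

print(len(train), "training rows,", len(validation), "validation rows")
```

Rows whose prediction date falls inside the gap are dropped from both sets. Their outcomes are not yet fully known at training time, so using them would leak information from the future.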

5 of 10

Data != Matrix

Data doesn’t come with labels; you have to create them

Data doesn’t come with features; you have to construct them

Rows in the raw data are rarely the rows in the training/validation matrix
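
As a rough sketch of what this means in practice, the example below turns event-level rows into one matrix row per entity per prediction date. Everything here is hypothetical: the `events` records, the `dropout` outcome, the feature names, and the 180-day horizon are invented for illustration. Features use only events before the prediction date, and the label uses only events in the outcome window after it.

```python
from datetime import date, timedelta

# Hypothetical raw data: one row per event, not one row per person.
events = [
    {"person": "a", "day": date(2023, 1, 10), "kind": "visit"},
    {"person": "a", "day": date(2023, 3, 5), "kind": "visit"},
    {"person": "a", "day": date(2023, 7, 20), "kind": "dropout"},
    {"person": "b", "day": date(2023, 2, 1), "kind": "visit"},
    {"person": "b", "day": date(2023, 8, 15), "kind": "visit"},
]

def build_row(person, as_of, events, horizon_days=180):
    """Build one matrix row for (person, as_of) from raw events."""
    mine = [e for e in events if e["person"] == person]
    # Features may only look at what was known before the prediction date.
    past = [e for e in mine if e["day"] < as_of]
    # The label is defined over a fixed window after the prediction date.
    window_end = as_of + timedelta(days=horizon_days)
    future = [e for e in mine if as_of <= e["day"] < window_end]
    return {
        "person": person,
        "as_of": as_of,
        "visits_before": sum(e["kind"] == "visit" for e in past),
        "days_since_last_event":
            min((as_of - e["day"]).days for e in past) if past else None,
        "label": int(any(e["kind"] == "dropout" for e in future)),
    }

as_of = date(2023, 6, 1)
people = sorted({e["person"] for e in events})
matrix = [build_row(p, as_of, events) for p in people]
for row in matrix:
    print(row)
```

Five raw event rows become two matrix rows here. Choosing a different prediction date or horizon would produce different features and labels from the same raw data.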

6 of 10

The Defaults Are Mostly Bad (aka Models Don’t Give 0/1 Labels)

Models give scores, not predicted classes, and these scores are rarely probabilities

There’s no such thing as absolute accuracy, precision, or recall

A 0.5 score threshold is almost never what you want

Hyperparameters matter and defaults in many packages aren’t great
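
To make the threshold point concrete, here is a small sketch with made-up scores and labels. It assumes a setting where only the top k scored entities can be acted on; the helper name `precision_recall_at_k` is an illustrative choice. Precision and recall come out different at every k, and a 0.5 cutoff flags nobody at all.

```python
def precision_recall_at_k(scores, labels, k):
    """Precision and recall when the k highest-scored rows are flagged."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    total_positives = sum(labels)
    precision = hits / k
    recall = hits / total_positives if total_positives else 0.0
    return precision, recall

# Hypothetical model output: no score reaches 0.5, yet the ranking is useful.
scores = [0.42, 0.38, 0.35, 0.30, 0.22, 0.15, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

flagged_at_half = [s for s in scores if s >= 0.5]
print("flagged at a 0.5 threshold:", len(flagged_at_half))

for k in (2, 4, 8):
    p, r = precision_recall_at_k(scores, labels, k)
    print(f"k={k}: precision={p:.2f}, recall={r:.2f}")
```

The metric only has meaning once k (or the threshold) is fixed, and that choice should come from how many cases can actually be acted on.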

7 of 10

Compared To What?

What is the right baseline that your model needs to beat to be useful? By how much?

How is this decision currently made (or what commonsense, non-ML approach could be used)?

What is the performance of the current, human-driven decision-making process? How fair is it?

Do explainability methods give you more information than crosstabs and feature importances?
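
One way to make the baseline questions concrete is to score the model, a commonsense heuristic, and the base rate on the same metric. The sketch below uses invented data and assumes the heuristic is "rank by a single raw count" (here a hypothetical `prior_incidents` field). The real baseline should be whatever process is currently used to make the decision.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true positives among the k highest-scored rows."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

# Hypothetical validation data.
labels          = [1,   0,   1,   0,   0,   1,   0,   0]
prior_incidents = [3,   0,   1,   2,   0,   2,   0,   1]      # heuristic baseline
model_scores    = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8, 0.3, 0.25]  # ML model output

k = 3
base_rate = sum(labels) / len(labels)            # expected precision of a random pick
baseline = precision_at_k(prior_incidents, labels, k)
model = precision_at_k(model_scores, labels, k)

print(f"random: {base_rate:.2f}  heuristic: {baseline:.2f}  model: {model:.2f}")
```

Whether the model's margin over the heuristic is big enough to justify building and maintaining it is a judgment call that depends on the use case, not something the numbers decide by themselves.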

8 of 10

Be Skeptical

Many published results are overstated (at best) and not tested on real-world use cases

Try things out for yourself and understand their limitations in your context/use case

Interpretability is far from a solved problem. So is algorithmic fairness.

… or for that matter, generalizing to the future, top k classification, etc.

9 of 10

Models Encode Values

Be explicit about what you want the model to do

… and validate that the model actually does this

Understand (and seek out) the perspectives of different stakeholders, including the people affected by the model

The goal is to have a fair overall system, not just a fair model
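
As one hedged illustration of validating that the model does what you intended, the sketch below checks recall per group among the top k flagged cases. Recall parity is only one of many possible fairness definitions, assumed here for the example. The groups, scores, and the helper name `recall_by_group` are all hypothetical, and checking a model this way says nothing yet about the fairness of the wider system around it.

```python
from collections import defaultdict

def recall_by_group(scores, labels, groups, k):
    """Recall within each group when the k highest-scored rows are selected."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = set(order[:k])
    positives = defaultdict(int)  # true positives that exist in each group
    found = defaultdict(int)      # of those, how many were selected
    for i, (label, group) in enumerate(zip(labels, groups)):
        if label == 1:
            positives[group] += 1
            if i in selected:
                found[group] += 1
    return {g: found[g] / positives[g] for g in positives}

# Hypothetical validation data with two groups, "x" and "y".
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   1,   0,   1,   0]
groups = ["x", "x", "x", "y", "x", "y", "y", "y"]

recalls = recall_by_group(scores, labels, groups, k=4)
for group, value in sorted(recalls.items()):
    print(f"group {group}: recall={value:.2f}")
```

If the stated goal were equal recall across groups, the gap between the two values is what would need explaining or correcting. A different goal would call for a different check.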

10 of 10

Recap

Scoping

Data Acquisition

Data Storage

Data Linkage

Data Exploration

Analytical Formulation

ML Pipelines

Baselines

Feature Generation

Train Test Splits

Evaluation Metrics

Modeling

Model Selection

Interpretability

Bias/Fairness

Field Trials

Deployment

Maintenance & Monitoring