Explainable Machine Learning is Reading Tea Leaves
Debugging AlphaGo
“I played with AlphaGo to understand where is the strong points of AlphaGo and where is maybe the weakness.
I played in the morning, afternoon, all time. And I find something. I find big weakness about AlphaGo. It’s a big one.”
Challenges with Model Debugging
Manual debugging: probing the model by hand, over and over, to find its weaknesses.
Debugging = Understanding
One has achieved understanding [of a system] if one can fix a broken implementation of the system.
The Model Debugging Dream
CI/CD for ML models.
[Figure: ML Developer → Debugging Unit → Deploy]
What is in the debugging unit?
Agenda
1. The Moving Substrate Problem
2. Canonical Problems
3. The Toolbox Approach
4. Interpretability-by-design
The Moving Substrate Problem
The model architecture, loss, and other pipeline components change every few years, and insights from one setting do not seem to translate to another.
[Figure: Input → Model → Prediction pipeline, with an Explain step applied before and after a change to the model]
Model Architecture Change in NLP
LSTM → Transformers
Supervised to contrastive loss
Supervised Loss → Contrastive Loss
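For reference, a minimal sketch (assumed PyTorch, not code from the talk) contrasting the two training signals: supervised cross-entropy on labels versus a simplified, one-directional NT-Xent-style contrastive loss over two augmented views of the same images; the temperature value is illustrative.

```python
import torch
import torch.nn.functional as F

def supervised_loss(logits, labels):
    # Standard supervised signal: cross-entropy against ground-truth labels.
    return F.cross_entropy(logits, labels)

def contrastive_loss(z1, z2, temperature=0.5):
    # Simplified NT-Xent-style signal: z1, z2 are (N, D) embeddings of two
    # augmented views of the same N images; matching rows are the positives.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```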
Why we should be concerned
Input gradients of standard models are not discriminative and exhibit feature leakage.
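For concreteness, a minimal sketch of the input-gradient saliency map being criticized here (assumed PyTorch; `model`, `image`, and `target_class` are placeholder names):

```python
import torch

def input_gradient_saliency(model, image, target_class):
    # Saliency = |d logit_target / d input|, aggregated over color channels.
    model.eval()
    x = image.clone().detach().requires_grad_(True)   # image: (1, C, H, W)
    logits = model(x)
    logits[0, target_class].backward()
    return x.grad.detach().abs().max(dim=1).values    # (1, H, W) saliency map
```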
Can we transfer insights from one setting to the other?
[Figure: explanations under Standard Training vs. PGD Adversarial Training]
Insights from standard training do not transfer to adversarially trained models.
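For context on what changes between the two settings, a hedged sketch of PGD adversarial training (in the spirit of Madry et al.); the perturbation budget, step size, and number of steps are illustrative values, not from the talk.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Projected gradient descent: maximize the loss inside an L-infinity ball.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()          # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)     # project back into the ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    # Train on the worst-case perturbed inputs instead of the clean ones.
    x_adv = pgd_attack(model, x, y)
    model.train()
    loss = F.cross_entropy(model(x_adv), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```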
What other changes in the ML pipeline will prevent insights from translating from one setting to another?
Canonical Problems
A set of problems that the community considers important and seeks to fix or address with its methods.
Spurious Training Signals
Deep learning models have been shown to rely on spurious training signals.
[Image: elephant]
Post hoc explanations for detecting reliance on spurious signals
[Figure: Input (Husky image) → Model → Prediction: ‘Husky’]
What parts of the input are ‘most important’ for the model’s prediction ‘Husky’?
The explanation shows the model relying on snow to identify Huskies.
Collect additional data to fix the bug.
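One simple way to surface this kind of reliance is an occlusion check, sketched below under assumed PyTorch conventions (not necessarily the attribution method used in the talk): cover patches of the input and record how much the ‘Husky’ score drops; large drops over the snowy background rather than the dog are the red flag.

```python
import torch

@torch.no_grad()
def occlusion_map(model, image, target_class, patch=16, fill=0.5):
    # image: (1, C, H, W). Returns a coarse grid of prediction-score drops.
    _, _, H, W = image.shape
    base = model(image).softmax(-1)[0, target_class]
    drops = torch.zeros((H + patch - 1) // patch, (W + patch - 1) // patch)
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            occluded = image.clone()
            occluded[..., i:i + patch, j:j + patch] = fill
            score = model(occluded).softmax(-1)[0, target_class]
            drops[i // patch, j // patch] = base - score   # big drop => important region
    return drops
```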
Reality
Source: Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior, Denain et al., 2022.
Noisy Training Labels
Feature attributions might not be effective for this task.
Source: Debugging Tests for Model Explanations, Adebayo et al., 2020.
Training point ranking has been shown to be effective in this case.
Source: Scaling Up Influence Functions, Schioppa et al., 2021.
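A hedged sketch of the general idea behind training-point ranking (gradient-similarity scoring in the spirit of influence functions / TracIn; this is not the exact estimator from Schioppa et al., and a real implementation would use gradient projections rather than full gradients). Assumes PyTorch and that `train_set` yields single-example `(x, y)` pairs.

```python
import torch
import torch.nn.functional as F

def loss_grad(model, x, y):
    # Flattened gradient of the loss w.r.t. all trainable parameters.
    loss = F.cross_entropy(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def rank_training_points(model, test_x, test_y, train_set):
    # Score each training example by how much its loss gradient aligns with
    # the gradient of a suspicious test example; top-ranked points are
    # candidates for mislabeled data.
    g_test = loss_grad(model, test_x, test_y)
    scores = [torch.dot(loss_grad(model, x, y), g_test).item() for x, y in train_set]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```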
New Methods Tested Against Canonical Problems
When a new method is introduced, test it against the field’s canonical problems (e.g., spurious training signals, noisy training labels).
The Toolbox Approach
When and for what tasks is a new explainable machine learning tool effective?
Specify in detail the tasks and settings for which it works.
Guided BackProp as a case study
Compute feature relevance by modifying the backpropagation so that only positive gradients are aggregated.
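A minimal sketch of that modification (assumed PyTorch): during the backward pass, each ReLU lets through only positive gradients, and only where its forward activation was positive.

```python
import torch
import torch.nn as nn

class GuidedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Positive aggregation: keep only positive gradients at positive activations.
        return grad_out * (x > 0).float() * (grad_out > 0).float()

class GuidedReLUModule(nn.Module):
    def forward(self, x):
        return GuidedReLU.apply(x)

def use_guided_relus(module):
    # Swap every nn.ReLU in the model for the guided version (illustrative helper).
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, GuidedReLUModule())
        else:
            use_guided_relus(child)

def guided_backprop(model, image, target_class):
    use_guided_relus(model)
    x = image.clone().detach().requires_grad_(True)
    model(x)[0, target_class].backward()
    return x.grad.detach()
```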
[Figure: Guided BackProp explanation for an input image predicted as ‘Junco Bird’]
Guided BackProp as a case study
[Figure: Guided BackProp explanations for Inception-V3 (ImageNet) as successive Inception blocks are re-initialized with random weights, compared with the normal model’s explanation]
Guided BackProp is invariant to the higher-level weights… but
Guided BackProp as a case study
Effective for detecting ‘stark’ spurious signals!
Source: Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior, Denain et al., 2022.
Guided BackProp as a case study
Noticed this issue in a previous paper of mine.
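That issue can be surfaced with a cascading parameter-randomization check, sketched below (in the spirit of the ‘Sanity Checks for Saliency Maps’ test; assumed PyTorch, and `explain_fn` is any attribution method such as the input-gradient saliency above): re-initialize layers roughly from the logits downward and see whether the explanation changes at all.

```python
import copy
import torch

def cascading_randomization(model, explain_fn, image, target_class):
    # Yield (layer_name, explanation) as layers are randomized, approximately
    # top-down for typical sequential models. An explanation that never changes
    # is insensitive to the learned weights.
    model = copy.deepcopy(model)
    layers = [(name, m) for name, m in model.named_modules()
              if getattr(m, "weight", None) is not None]
    for name, module in reversed(layers):
        torch.nn.init.normal_(module.weight, std=0.02)
        yield name, explain_fn(model, image, target_class)
```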
Guided BackProp as a case study
Behavior on a re-initialized model depends on the task.
When benchmarking, be careful to specify the setting in as much detail as possible.
Takeaway
Interpretability-by-design
Interpretable-by-design approaches
Complex Systems are Pervasive in Other Disciplines
Nuclear Power Plants
Watchmaking
Let’s translate insights from these to ML debugging.
Requirements for debugging a complex system
Interaction: requires that the end-user understand the model’s components.
Model Debugging via Interaction with a Domain Expert
[Figure: domain expert interacting with the model through its concept representation]
How can you debug a system whose concept representation you don’t understand?
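One interpretable-by-design answer is a concept bottleneck: the model first predicts human-named concepts, and a simple layer maps those concepts to the label, so a domain expert can read, and even override, the concept values. A hedged sketch (assumed PyTorch; the names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, backbone, n_concepts, n_classes):
        super().__init__()
        self.backbone = backbone                       # any feature extractor
        self.to_concepts = nn.LazyLinear(n_concepts)   # scores for named concepts
        self.to_label = nn.Linear(n_concepts, n_classes)

    def forward(self, x, concept_override=None):
        concepts = torch.sigmoid(self.to_concepts(self.backbone(x)))
        if concept_override is not None:
            concepts = concept_override                # domain expert intervenes here
        return self.to_label(concepts), concepts       # label logits + readable concepts
```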
Recap