1 of 54

Explainable Machine Learning is Reading Tea Leaves

2 of 54

Debugging AlphaGo

3 of 54

Debugging AlphaGo

“I played with AlphaGo to understand where is the strong points of AlphaGo and where is maybe the weakness.

I played in the morning, afternoon, all time. And I find something. I find big weakness about AlphaGo. It’s a big one.”

4 of 54

Challenges with Model Debugging

“I played with AlphaGo to understand where is the strong points of AlphaGo and where is maybe the weakness.

I played in the morning, afternoon, all time. And I find something. I find big weakness about AlphaGo. It’s a big one.”

Manual

5 of 54

Debugging = Understanding

6 of 54

Debugging = Understanding

One has achieved understanding [of a system] if one can fix a broken implementation of the system.

7 of 54

Debugging = Understanding

  • If you can ‘debug’ a system, then you understand that system,

8 of 54

Debugging = Understanding

  • If you can ‘debug’ a system, then you understand that system,

  • The debugging process involves detecting and fixing a system’s mistakes,

9 of 54

Debugging = Understanding

  • If you can ‘debug’ a system, then you understand that system,

  • The debugging process involves detecting and fixing a system’s mistakes,

  • The goal of explainable ML is to get an end-user to understand a ML model,

10 of 54

Debugging = Understanding

  • If you can ‘debug’ a system, then you understand that system,

  • The debugging process involves detecting and fixing a system’s mistakes,

  • The goal of explainable ML is to get an end-user to understand a ML model,

  • Then, explainable ML = model debugging.

11 of 54

The Model Debugging Dream

CI/CD for ML models.

[Pipeline diagram: ML Developer → Debugging Unit → Deploy.]

12 of 54

What is in the debugging unit?

[Pipeline diagram: ML Developer → Debugging Unit (?) → Deploy.]

  • Feature Attributions
  • Concept Activation Methods
  • Training Point Ranking
  • Something else?

CI/CD for ML models.
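As one hedged illustration of what a check inside such a debugging unit could look like, the sketch below is a CI-style test that fails when a feature attribution places most of its mass outside an annotated object region. The metric and toy tensors are illustrative stand-ins; in practice the attribution and masks would come from the deployed model and a small annotated validation set.

    import torch

    def attribution_outside_region(attribution, region_mask):
        """Fraction of attribution mass that falls outside an annotated region."""
        total = attribution.abs().sum()
        outside = (attribution.abs() * (~region_mask)).sum()
        return (outside / total).item()

    def test_attribution_stays_on_object():
        # Toy stand-ins for a real attribution map and object mask.
        attribution = torch.rand(1, 32, 32)
        object_mask = torch.zeros(1, 32, 32, dtype=torch.bool)
        object_mask[:, 8:24, 8:24] = True
        # Fail the build if most attribution mass lies off the object.
        assert attribution_outside_region(attribution, object_mask) < 0.9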

13 of 54

Agenda

  • The Moving Substrate Problem: can insights from explainability/interpretability keep up with changes to the ML pipeline?

  • Canonical Problems: Interpretable/Explainable ML needs canonical problems.

  • Toolbox Approach: When and for what task is an explainability/interpretability tool effective?

  • The Interpretable-by-design elephant in the room.

14 of 54

The Moving Substrate Problem

15 of 54

The Moving Substrate Problem

The model architecture, loss, and other components change every few years, and insights from one setting don’t seem to translate to another setting.

16 of 54

The Moving Substrate Problem

17 of 54

[Diagram: Input → Model → Prediction pipeline, with an ‘Explain’ step before and after a change. Example change: model architecture in NLP, from LSTMs to Transformers.]

18 of 54

[Diagram: Input → Model → Prediction pipeline, with an ‘Explain’ step before and after a change. Example change: training objective, from a supervised loss to a contrastive loss.]

19 of 54

Why we should be concerned

Input gradients of standard models are not discriminative and exhibit feature leakage.

20 of 54

Why we should be concerned

Input gradients of standard models are not discriminative and exhibit feature leakage.
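For concreteness, here is a minimal sketch of the input-gradient (vanilla saliency) explanation referred to above; the stand-in linear classifier and random input are placeholders, not the models behind the slide.

    import torch
    import torch.nn as nn

    def input_gradient(model, x, target_class):
        """Vanilla saliency: gradient of the target logit w.r.t. the input."""
        x = x.clone().detach().requires_grad_(True)
        model(x)[0, target_class].backward()
        return x.grad.detach()

    # Toy usage with a stand-in classifier and a random input.
    model = nn.Linear(10, 3)
    saliency = input_gradient(model, torch.rand(1, 10), target_class=2)

Whether maps like this are class-discriminative, and whether they leak features, is exactly the kind of property that shifts as the underlying model and training change.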

21 of 54

Can we transfer insights from one setting to another?

[Diagram: standard training vs. PGD adversarial training.]

Insights from standard training do not transfer to adversarially trained models.

What other changes to the ML pipeline will prevent insights from translating from one setting to another?
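For reference, a minimal sketch of the PGD attack step used in adversarial training (in the spirit of Madry et al.); model, x, and y are placeholders, and the epsilon/step-size values are illustrative defaults rather than the settings behind the slide.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Craft an L-infinity-bounded adversarial example for (x, y)."""
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv + alpha * grad.sign()                    # gradient ascent on the loss
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)                              # keep pixels valid
        return x_adv.detach()

In Madry-style adversarial training, each minibatch is replaced by pgd_attack(model, x, y) before the usual parameter update; the slide’s point is that explanation behavior established on standard training need not carry over to models trained this way.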

22 of 54

Canonical Problems

23 of 54

Canonical Problems

A set of problems that the community considers important and seeks to fix/address with its methods.

24 of 54

Canonical Problems

  • Spurious Training Signals

  • Noisy Labels

  • Worst-Performing Subgroup (‘Fairness’)

25 of 54

Spurious Training Signals

Deep learning models have been shown to:

  • rely on backgrounds for object classification [Xiao et al. 2020];

26 of 54

Spurious Training Signals

Deep learning models have been shown to:

  • rely on backgrounds for object classification [Xiao et al. 2020];

  • rely on image texture [Geirhos et al. 2018];

[Image: elephant.]

27 of 54

Spurious Training Signals

Deep learning models have been shown to:

  • rely on backgrounds for object classification [Xiao et al. 2020];

  • rely on image texture [Geirhos et al. 2018];

  • easily learn spurious signals from just 3 training examples [Yang et al. 2022]!

[Image: elephant.]

28 of 54

Post hoc explanations for detecting reliance on spurious signals

29 of 54

Post hoc explanations for detecting reliance on spurious signals

[Diagram: Input → Model → Prediction.]

What parts of the input are ‘most important’ for the model’s prediction of ‘Husky’?

30 of 54

Post hoc explanations for detecting reliance on spurious signals

[Diagram: Input → Model → Prediction.]

What parts of the input are ‘most important’ for the model’s prediction of ‘Husky’?

31 of 54

Post hoc explanations for detecting reliance on spurious signals

[Diagram: Input → Model → Prediction.]

What parts of the input are ‘most important’ for the model’s prediction of ‘Husky’?

Model relying on snow to identify Huskies.

32 of 54

Post hoc explanations for detecting reliance on spurious signals

[Diagram: Input → Model → Prediction.]

What parts of the input are ‘most important’ for the model’s prediction of ‘Husky’?

Model relying on snow to identify Huskies.

Collect additional data to fix the bug.
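Before collecting new data, one cheap follow-up check (a hedged sketch, not part of the original Husky example) is to occlude the suspected spurious region and see whether the prediction flips; model, image, and region_mask below are placeholders.

    import torch

    def occlusion_check(model, image, region_mask, fill_value=0.0):
        """Compare predictions with and without the suspected spurious region.

        image: (N, C, H, W) tensor; region_mask: boolean (H, W) tensor.
        """
        occluded = image.clone()
        occluded[:, :, region_mask] = fill_value  # blank out, e.g., the snowy background
        with torch.no_grad():
            before = model(image).argmax(dim=1)
            after = model(occluded).argmax(dim=1)
        return before, after  # a flipped label is evidence of reliance on the region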

33 of 54

Reality

Source: Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior, Denain, 2022

34 of 54

Noisy Training Labels

35 of 54

Noisy Training Labels

Feature attributions might not be effective for this task.

Source: Debugging Tests for Model Explanations, Adebayo 2020.

36 of 54

Noisy Training Labels

Training point ranking has been shown to be effective in this case.

Source: Scaling up influence functions, Schioppa, 2021
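As a rough illustration of training point ranking for this task (not the Arnoldi-based influence functions of the cited paper), the sketch below ranks training examples by a simple self-influence proxy, the squared gradient norm of their own loss, which tends to surface mislabeled points.

    import torch
    import torch.nn.functional as F

    def rank_by_self_influence(model, dataset):
        """Return training indices sorted from most to least suspicious."""
        scores = []
        for x, y in dataset:  # one (input, label) pair at a time
            model.zero_grad()
            loss = F.cross_entropy(model(x.unsqueeze(0)), torch.as_tensor(y).view(1))
            loss.backward()
            sq_norm = sum((p.grad ** 2).sum() for p in model.parameters()
                          if p.grad is not None)
            scores.append(sq_norm.item())
        # Highest-scoring points are candidates for label noise.
        return sorted(range(len(scores)), key=lambda i: -scores[i])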

37 of 54

New Methods Tested Against Canonical Problems

When a new method is introduced, test it against the field’s canonical problems:

  • Spurious Training Signals

  • Noisy Labels

  • Worst-Performing Subgroup (‘Fairness’)

38 of 54

The Toolbox Approach

39 of 54

The Toolbox Approach

When and for what tasks is a new explainable machine learning tool effective?

Specify in detail:

  • Data modality: images, text, tabular, audio etc.
  • Model Architecture: Convolutional, Transformer?
  • Loss: Log-loss, Contrastive?
  • Training Type: Standard or Adversarial?
  • End-User: ML researcher, lay person?
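One lightweight way to make this specification explicit when reporting a benchmark is to record it as structured metadata; the schema below only mirrors the fields listed above and is illustrative, not an existing standard.

    from dataclasses import dataclass

    @dataclass
    class BenchmarkSetting:
        data_modality: str   # "images", "text", "tabular", "audio", ...
        architecture: str    # "convolutional", "transformer", ...
        loss: str            # "log-loss", "contrastive", ...
        training_type: str   # "standard", "adversarial", ...
        end_user: str        # "ml-researcher", "lay-person", ...

    setting = BenchmarkSetting(
        data_modality="images", architecture="convolutional",
        loss="log-loss", training_type="adversarial", end_user="ml-researcher",
    )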

40 of 54

Guided BackProp as a case study

Compute feature relevance by modifying backpropagation: at each ReLU, only positive gradients are passed back, and only through units whose forward activation was positive.
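A minimal sketch of that rule in PyTorch is below; the tiny CNN and random input are placeholders, and the only essential piece is the backward hook on each ReLU.

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
    ).eval()

    def guided_relu_hook(module, grad_input, grad_output):
        # Standard ReLU backprop already zeroes positions whose forward
        # activation was <= 0; additionally drop negative gradients.
        return (torch.clamp(grad_input[0], min=0.0),)

    for m in model.modules():
        if isinstance(m, nn.ReLU):
            m.register_full_backward_hook(guided_relu_hook)

    x = torch.rand(1, 3, 32, 32, requires_grad=True)  # stand-in image
    model(x)[0].max().backward()                      # top logit
    saliency = x.grad.abs().sum(dim=1)                # Guided BackProp map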

41 of 54

Guided BackProp as a case study

[Figure: input image, model prediction ‘Junco Bird’, and the corresponding Guided BackProp map.]

42 of 54

Guided BackProp as a case study

  • Randomize (re-initialize) the model parameters, starting from the top layer all the way down to the input.

[Figure: Guided BackProp explanations for Inception-V3 on ImageNet as successive Inception blocks are re-initialized with random weights, alongside the explanation for the normally trained model.]

Guided BackProp is invariant to the higher-level weights.
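A hedged sketch of the cascading randomization check itself: re-initialize parameters from the top layer toward the input and recompute the explanation after each step. Here explain is a placeholder for any attribution function (for example, the Guided BackProp sketch earlier); an explanation that barely changes across the sweep, as Guided BackProp does here, fails the check.

    import copy

    def cascading_randomization(model, x, explain):
        """Explanations after successively re-initializing layers, top-down.

        Assumes modules are defined in input-to-output order.
        """
        model = copy.deepcopy(model)  # keep the original model intact
        maps = []
        # Iterate modules from the output end back toward the input.
        for module in reversed(list(model.modules())):
            if hasattr(module, "reset_parameters"):
                module.reset_parameters()      # randomize this layer's weights
                maps.append(explain(model, x))
        return maps

Comparing each map in maps to the original model’s explanation (for example, with a rank correlation) quantifies the (in)sensitivity illustrated in the figure.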

43 of 54

Guided BackProp as a case study

  • Randomize (re-initialize) the model parameters, starting from the top layer all the way down to the input.

[Figure: Guided BackProp explanations for Inception-V3 on ImageNet as successive Inception blocks are re-initialized with random weights, alongside the explanation for the normally trained model.]

Guided BackProp is invariant to the higher-level weights… but

44 of 54

Guided BackProp as a case study

Effective for detecting ‘stark’ spurious signals!

Source: Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior, Denain, 2022

45 of 54

Guided BackProp as a case study

I noticed this issue in a previous paper of mine.

46 of 54

Guided BackProp as a case study

Behavior on a re-initialized model depends on the task.

When benchmarking, be careful to specify the setting in as much detail as possible.

47 of 54

Takeaway

  • We need careful benchmarking, both when introducing a new method and evaluating current ones.

  • Specify the setting in as much detail as possible:
      • Data modality: images, text, tabular, audio etc.
      • Model Architecture: Convolutional, Transformer?
      • Loss: Log-loss, Contrastive?
      • Training Type: Standard or Adversarial?

48 of 54

Interpretability-by-design

49 of 54

Interpretable-by-design approaches

50 of 54

Interpretable-by-design approaches

51 of 54

Complex Systems are Pervasive in other Disciplines

Nuclear Power Plants

Watchmaking

Let’s translate insights from these fields to ML debugging.

52 of 54

Requirements for debugging a complex system

  • Each part of the system should have a specific function, and the function should be known.

  • When something goes wrong:
      • Isolate each component and check if it is working,
      • Isolate the interconnection between each component and check if it is working.

53 of 54

Interaction requires that the end-user understand model components

Model Debugging Via Interaction with Domain Expert

[Diagram: a domain expert interacting with the model through concepts.]

How can you debug a system whose concept representation you don’t understand?

54 of 54

Recap

  • The Moving Substrate Problem: can insights from explainability/interpretability keep up with changes to the ML pipeline?

  • Canonical Problems: Interpretable/Explainable ML needs canonical problems.

  • Toolbox Approach: When and for what task is an explainability/interpretability tool effective?

  • The Interpretable-by-design elephant in the room.