1 of 32

interpretability workshop

github link

(with notebooks)

2 of 32

interpretability depends on context

data

audience

3 of 32

overview

  1. how can we build a simple model?
  2. which features are globally important?
  3. which features are locally important?

4 of 32

5 of 32

How can we build a simple model?

6 of 32

example 1: sparse integer linear model

struck et al. 2017
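A minimal sketch of the idea behind sparse integer scoring systems: small integer coefficients make the model evaluable with pencil and paper. The cited work solves a constrained integer program; here we only round a least-squares fit on made-up data, which is a crude stand-in, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
X = rng.normal(size=(n, 4))
# synthetic target: only the first two features matter
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.3, size=n)

# fit real-valued coefficients, then round to small integers
# (the real approach optimizes over integers directly, which behaves
# much better when sparsity and accuracy constraints bind)
w, *_ = np.linalg.lstsq(np.c_[X, np.ones(n)], y, rcond=None)
w_int = np.round(w).astype(int)

score = X @ w_int[:4] + w_int[4]     # an integer "scoring system"
print(w_int[:4])                     # sparse integer weights, e.g. [3, -2, 0, 0]
```

Rounding recovers the integer weights here only because the problem is easy; the appeal of the exact formulation is that it stays reliable when it is not.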

7 of 32

example 2: optimal classification tree

bertsimas & dunn 2017

8 of 32

example 3: bayesian rule list

letham et al. 2015

9 of 32

example 4: rulefit

molnar et al. 2019

10 of 32

example 5: (causal) structural equation model

bottou et al. 2013

11 of 32

example 6: prototypical neural networks

chen et al. 2018

12 of 32

13 of 32

14 of 32

Which features are globally important?

15 of 32

global linear feature importances

  • no model: (rank) correlation, partial correlation
  • linear / logistic: coefficients (caution with categorical variables)

16 of 32

  • tree / tree ensembles: (normalized) total impurity reduction by a feature
  • neural network / nonlinear svm: none built in (use model-agnostic methods)
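The first two bullets can be illustrated on synthetic data: a model-free (rank) correlation screen, and coefficients of a linear fit as global importances. The data and coefficients below are made up; note the coefficient reading assumes features on comparable scales.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
# y depends strongly on x0, weakly on x1, not at all on x2
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

# model-free: correlation of each feature with the target
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(3)])

# linear model: coefficients as global importances
# (features are already standardized here, so they are comparable)
coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(n)], y, rcond=None)

print(corrs.round(2))        # x0 dominates, x2 near zero
print(coef[:3].round(2))     # roughly [3.0, 0.5, 0.0]
```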

17 of 32

  • sobol’s indices (sobol 1993)
  • permutation importance (breiman 2001)
  • delta index (borgonovo 2007)

finding important variables: no interactions

screening unimportant variables: use interactions

only permutation importance requires a model
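Permutation importance is simple enough to write from scratch: shuffle one column and measure how much the model's error grows. Model and data below are made up; in practice this is scored on held-out data.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = 2.0 * X[:, 0] + X[:, 1]          # x2 is irrelevant

# a simple fitted "model": ordinary least squares
w, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda A: A @ w

def mse(a, b):
    return float(np.mean((a - b) ** 2))

baseline = mse(predict(X), y)

# permutation importance: shuffle one column, record the increase in error
importances = []
for j in range(3):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(mse(predict(Xp), y) - baseline)

print([round(v, 2) for v in importances])   # large for x0, moderate for x1, ~0 for x2
```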

18 of 32

19 of 32

How does the model use different features?

20 of 32

  • PDP / ICE plot (friedman 2001)
  • ALE plot (apley 2016)

molnar 2019
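A partial dependence curve (friedman 2001) is just the average prediction with one feature clamped to each grid value; each unaveraged row is one observation's ICE curve. The black-box model below is a made-up stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(500, 2))

# hypothetical black-box model, nonlinear in x0
def model(A):
    return np.sin(3 * A[:, 0]) + 0.5 * A[:, 1]

# partial dependence of feature 0: clamp x0 to each grid value,
# average the model's predictions over the data
grid = np.linspace(-1, 1, 21)
pdp = []
for v in grid:
    Xv = X.copy()
    Xv[:, 0] = v
    pdp.append(model(Xv).mean())
pdp = np.array(pdp)                  # tracks sin(3 * x0) up to a constant
```

ALE plots replace the clamping with local differences accumulated along the feature's distribution, which avoids averaging over unrealistic points when features are correlated.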

21 of 32

SHAP-Interact

lundberg et al. 2019

22 of 32


23 of 32

24 of 32

How can we understand one prediction?

25 of 32

26 of 32

LIME (ribeiro et al. 2016)

SHAP (lundberg & lee, 2017)

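SHAP approximates Shapley values efficiently; with only three features they can be enumerated exactly, which makes the definition concrete. The model, background data, and instance below are all made up, and the value function is the interventional (marginal-sampling) one.

```python
import itertools
import math

import numpy as np

rng = np.random.default_rng(3)
background = rng.normal(size=(200, 3))                # hypothetical background data
model = lambda A: 2 * A[:, 0] + A[:, 1] * A[:, 2]     # hypothetical black box
x = np.array([1.0, 2.0, 3.0])                         # instance to explain

def coalition_value(S):
    """Average prediction with features in S fixed to x, the rest
    sampled from the background (interventional value function)."""
    A = background.copy()
    for j in S:
        A[:, j] = x[j]
    return model(A).mean()

n = 3
phi = np.zeros(n)
for j in range(n):
    others = [k for k in range(n) if k != j]
    for size in range(n):
        for S in itertools.combinations(others, size):
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            phi[j] += weight * (coalition_value(S + (j,)) - coalition_value(S))

# efficiency: attributions sum to f(x) minus the average background prediction
print(phi.round(2))
```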

27 of 32

Sampling: the problem with local explanations

28 of 32

Sampling 2: conditional sampling can introduce spurious attribution for interactions

Let f(x) = X1. If X2 is correlated with X1, conditioning on X2 changes the distribution of X1, so X2 receives attribution even though f never uses it.
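One toy way to see the failure mode (an illustrative sketch, not the slide's exact construction): the model uses only x1, and x2 is perfectly correlated with x1. Marginal (interventional) sampling gives x2 zero attribution; conditional sampling lets x2 soak up x1's effect.

```python
import numpy as np

rng = np.random.default_rng(4)
x1 = rng.normal(size=5000)
x2 = x1.copy()                         # x2 is perfectly correlated with x1
f = lambda a, b: a                     # the model only ever uses x1

point = (1.5, 1.5)                     # instance to explain

# marginal sampling: fix x2 = 1.5, draw x1 from its marginal
marginal_attr = f(x1, np.full_like(x1, point[1])).mean() - f(x1, x2).mean()

# conditional sampling: fix x2 = 1.5 and draw x1 | x2 = 1.5,
# which here forces x1 = 1.5 as well
conditional_attr = (f(np.full_like(x1, point[1]), np.full_like(x1, point[1])).mean()
                    - f(x1, x2).mean())

print(round(float(marginal_attr), 3))     # 0: the model ignores x2
print(round(float(conditional_attr), 3))  # ~1.5: spurious credit to x2
```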

29 of 32

henin & metayer 2019

30 of 32

easy, effective uncertainty

  • ensemble uncertainty
  • quantile loss prediction interval
  • bayesian methods
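The first bullet in one sketch: refit the same model on bootstrap resamples and read the spread of the ensemble's predictions as a rough uncertainty estimate. Data and model below are made up.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, size=(200, 1))
y = 2 * X[:, 0] + rng.normal(scale=0.3, size=200)

# ensemble uncertainty: fit one least-squares model per bootstrap resample
A = np.c_[X, np.ones(len(X))]
x_new = np.array([0.5, 1.0])           # query point (with intercept term)
preds = []
for _ in range(200):
    idx = rng.integers(0, len(X), size=len(X))
    w, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
    preds.append(float(x_new @ w))
preds = np.array(preds)

# point estimate with a 2-sigma ensemble spread
print(round(preds.mean(), 2), "+/-", round(2 * preds.std(), 2))
```

The same recipe works for any refittable model; quantile-loss intervals and Bayesian posteriors trade this simplicity for calibrated coverage.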

31 of 32

What else is out there?

32 of 32

  • Influence functions - find points which highly influenced a model (koh & liang 2017)
  • TCAV - see if representations of certain points learned by a DNN are linearly separable (kim et al. 2017)
  • MMD Critic - find a few points which summarize classes (kim et al. 2016)
  • ACD - hierarchical interpretations for DNNs (singh et al. 2019)