Chandan Singh
Useful interpretations for machine learning
in science & medicine
Advisor: Bin Yu
Interpretable ML extracts useful, relevant knowledge
[Figure: the modeling loop — problem, data, audience → model → post hoc analysis → iterate]
Desiderata: predictive accuracy, descriptive accuracy, relevancy
Existing methods (e.g. LIME, SHAP) don't capture feature interactions well.
Interpreting + improving neural networks
Hierarchical interactions: ICLR ‘19
CS*, Murdoch*, & Yu
Interaction regularization: ICML ‘20
Rieger, CS, Murdoch, & Yu
Transformation importances: ICLR Workshop ‘20
CS*, Ha*, Lanusse, Boehm, Liu, & Yu
[Figure: the phrase “not very good” is predicted Negative; the hierarchical interpretation scores “very good” as positive, “not” as negative, and the full phrase “not very good” as negative]
[Figure: CD splits the input — original input (“not very good”) = relevant part (“very good”) + irrelevant part (“not”); the CD importance of “very good” is the relevant part’s contribution to the prediction]
CD importance: at every layer, the activation splits into a relevant part + an irrelevant part, and this split is propagated through each layer type (linear / conv, ReLU, pooling); the final relevant part is the CD importance.
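A minimal NumPy sketch of how such a decomposition can be propagated through a plain feedforward layer; the bias-splitting rule below is one simple variant, not necessarily the paper's exact rule:

```python
import numpy as np

def cd_linear(rel, irrel, W, b):
    # Propagate the relevant/irrelevant split through a linear layer.
    # The shared bias is divided in proportion to each part's magnitude
    # (a simplification; the CD papers give the precise rule).
    rel_out, irrel_out = W @ rel, W @ irrel
    total = np.abs(rel_out) + np.abs(irrel_out) + 1e-12
    return (rel_out + b * np.abs(rel_out) / total,
            irrel_out + b * np.abs(irrel_out) / total)

def cd_relu(rel, irrel):
    # ReLU rule: the relevant part keeps ReLU(rel); the irrelevant part
    # absorbs the rest, so the two parts still sum to ReLU(rel + irrel).
    rel_out = np.maximum(rel, 0)
    return rel_out, np.maximum(rel + irrel, 0) - rel_out
```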
Importances work for transformations via a reparameterization trick
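As a rough illustration of the idea (not the paper's exact method), one can score a group of coefficients in a transformed space by occluding them and measuring the change in the prediction; `f` and `mask` below are hypothetical placeholders:

```python
import numpy as np

def transform_importance(f, x, mask):
    # Reparameterize x in a transformed (here, Fourier) space, zero out the
    # coefficients selected by mask, and measure the prediction change.
    z = np.fft.rfft(x)
    z_occluded = np.where(mask, 0.0, z)
    x_occluded = np.fft.irfft(z_occluded, n=len(x))
    return f(x) - f(x_occluded)
```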
Evaluation 1: human accuracy at identifying a corrupted model
[Figure: human accuracy across interpretation methods]
Evaluation 2: penalizing importances improves performance

| | Before penalty | After penalty |
| F1 | 0.67 | 0.73 |
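A sketch of the penalization idea in PyTorch, with input gradients standing in for the CD-based importance used in the paper; `model`, `irrelevant_mask`, and `lam` are illustrative names:

```python
import torch
import torch.nn.functional as F

def penalized_loss(model, x, y, irrelevant_mask, lam=1.0):
    # Prediction loss plus a penalty on importance assigned to features
    # that are known a priori to be irrelevant (e.g. a spurious patch).
    x = x.clone().requires_grad_(True)
    logits = model(x)
    loss = F.cross_entropy(logits, y)
    grads = torch.autograd.grad(loss, x, create_graph=True)[0]
    penalty = (grads * irrelevant_mask).abs().sum()
    return loss + lam * penalty
```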
Molecular-partner prediction
with interpretable neural networks
CS*, Li*, Ruan, Song, Dang, He, Upadhyayula, & Yu
[Figure: clathrin-mediated endocytosis (Cremona 2001)]
Tracking molecular partners is a central challenge
...but it’s experimentally difficult, so given one partner, we predict the other
Fitting a network for spike prediction
[Figure: clathrin and auxilin fluorescence traces (amplitude vs. time; data from Aguet et al. ‘13, ‘16); binary classification of each clathrin trace: spike or no spike?]
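A minimal sketch of such a trace classifier in PyTorch; the recurrent architecture and sizes here are illustrative assumptions, not the paper's exact network:

```python
import torch.nn as nn

class SpikeClassifier(nn.Module):
    # Reads a fixed-length clathrin fluorescence trace and outputs a logit
    # for whether an auxilin spike occurs.
    def __init__(self, hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):          # x: (batch, time, 1)
        _, (h, _) = self.lstm(x)   # final hidden state summarizes the trace
        return self.head(h[-1])    # (batch, 1) spike logit
```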
Interpreting individual predictions with CD
[Figure: for a trace predicted “Spike!”, CD colors each timestep by its positive or negative contribution to the prediction]
Ha, CS, Lanusse, Upadhyayula, & Yu, NeurIPS ‘21
Our interpretation is incomplete: the network’s prediction still depends on thousands of parameters. Solution: distillation into a model with <20 parameters!
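Distillation in its simplest form: fit a small student model to the network's predictions instead of the raw labels. The decision-tree student below is only a stand-in (the actual work distills into an adaptive-wavelet model):

```python
from sklearn.tree import DecisionTreeRegressor

def distill(teacher, X, max_leaf_nodes=8):
    # Fit a small, interpretable student to the teacher's soft predictions.
    soft_labels = teacher.predict(X)
    student = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes)
    student.fit(X, soft_labels)
    return student
```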
Wavelet transform
A wavelet transform maps a signal $x$ to coefficients $z$ against shifted and dilated copies of a mother wavelet $\psi$: $z_{j,k} = \langle x, \psi_{j,k} \rangle$, where $\psi_{j,k}(t) = 2^{j/2}\,\psi(2^j t - k)$.
The wavelet function can vary (e.g. Haar, Mexican hat, db4, sym4), and the $\psi_{j,k}$ form an orthonormal basis when $\psi$ satisfies suitable conditions (Mallat, 1998).
Adaptive wavelet: instead of fixing $\psi$, learn the wavelet filter from data, so the transform $x \to z$ is fit jointly with the prediction task.
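For concreteness, a fixed (non-adaptive) discrete wavelet transform using the PyWavelets library; the signal and the db4 wavelet choice are arbitrary examples:

```python
import numpy as np
import pywt  # PyWavelets

x = np.random.randn(256)          # example 1D signal
z = pywt.wavedec(x, 'db4')        # multilevel discrete wavelet transform
x_rec = pywt.waverec(z, 'db4')    # inverse transform
assert np.allclose(x, x_rec)      # orthogonal wavelets reconstruct exactly
```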
Adaptive wavelet + distillation
AWD is accurate and interpretable
[Figure: input x → wavelet coefficients z → wavelet reconstruction → prediction]

| Model | AWD | Neural network | Baseline |
| R² | 0.263 | 0.237 | 0.197 |
AWD works on a completely different problem

| Model | AWD | Neural network | Roberts cross* |
| RMSE (×10⁻⁴) | 1.029 | 1.156 | 1.259 |

*Ribli et al. (2019), Nature Astronomy
Clinical-decision rule modeling
Kornblith*, CS*, Devlin, Addo, Streck, Holmes,
Kuppermann, Grupp-Phelan, Fineman, Butte, & Yu
[Figure: clinical decision tree; each node shows positives/total patients — 203/12044 at the root, with nodes of 6/5034, 112/1963, 38/826, 36/2532, 6/955, 2/305, 1/34, and 2/395]
Split features: abdominal trauma / seat belt sign, GCS 3-13, abdominal tenderness, thoracic wall trauma, abdominal pain, decreased or absent breath sounds, vomiting
FIGS: Fast interpretable greedy-tree sums
Tan*, Singh*, Nasseri, Agarwal, & Yu
Trees compete with each other to predict the outcome: the model is a sum of small trees, prediction = tree₁(x) + tree₂(x) + ... + treeᵢ(x), grown greedily.
[Figure: performance vs. number of rules for FIGS and baselines]
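FIGS is available in the imodels package; a minimal usage sketch (dataset chosen arbitrarily):

```python
from sklearn.datasets import load_breast_cancer
from imodels import FIGSClassifier  # pip install imodels

X, y = load_breast_cancer(return_X_y=True)
model = FIGSClassifier(max_rules=10)  # cap the total number of rules
model.fit(X, y)
print(model)  # displays the learned sum of small trees
```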
imodels: a Python library for interpretability
Singh*, Nasseri*, Tan, Tang, & Yu, JOSS ‘21
⭐ 700+ GitHub stars
Acknowledgements
Thesis committee
Internship advisors
Undergrad advisors
Yu Group
Berkeley friends
Old friends
Family
Thank you
Interpreting individual predictions with CD
[Figure: CD scores over time for individual traces, colored positive/negative, including an unsure example]
Random stuff
w/ Alejo Rico-Guevara, Xin Cheng
w/ Gang-Yu Liu, Jiali Zhang
w/ Mike Eickenberg, Reza Abbasi-Asl, Mike Oliver
w/ Summer Devlin, Jamie Murdoch
w/ Raaz Dwivedi, Martin Wainwright
w/ Yu-group
w/ James Duncan, Rush Kapoor, Sahil Saxena
Hierarchical shrinkage for trees
Agarwal*, Tan*, Ronen, Singh, & Yu
arXiv, submitted to ICML ‘22
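Hierarchical shrinkage regularizes an already-fitted tree by shrinking each node’s prediction toward its parent’s. A minimal sketch, assuming imodels’ HSTreeClassifier wrapper:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
from imodels import HSTreeClassifier  # hierarchical-shrinkage wrapper

X, y = load_breast_cancer(return_X_y=True)
# reg_param sets the shrinkage strength: larger values pull leaf
# predictions more strongly toward their ancestors' means.
model = HSTreeClassifier(estimator_=DecisionTreeClassifier(), reg_param=1.0)
model.fit(X, y)
```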
Covid county-level
forecasting / data curation
Yu Group + Response4Life, HDSR 2021
Evaluating interpretation: simulations
[Figure: models trained on simulated (X, Y) data with known ground-truth feature importances]

| Method | CD | DeepLift | SHAP | IG |
| Error (%) | 0.4 | 3.6 | 4.0 | 4.2 |
FIGS can potentially improve clinical decision rules
IAI (intra-abdominal injury)
CSI (cervical spine injury)
TBI (traumatic brain injury)
Ha, Singh, Lanusse, Upadhyayula, & Yu, NeurIPS ‘21
Bin Yu, Jamie Murdoch, Karl Kumbier, Reza Abbasi-Asl, Aaron Kornblith, François Lanusse, Wooseok Ha, Vanessa Boehm, Gokul Upadhyayula, Xiongtao Ruan, Xiao Li, Yu Group, Yan Shuo Tan, Raaz Dwivedi, Laura Rieger
Definitions
Interpretability is the “ability to explain or to present in understandable terms to a human” (Doshi-Velez & Kim, 2017)
Related to trust, causality, transferability, informativeness, fairness (Lipton 2017)
“Explanations are... the currency in which we exchange beliefs” (Lombrozo 2006)
Interpretations should be useful
Improve predictive accuracy
Quantify uncertainty
Make causal recommendations
Baseline importance of “very good”
[Figure: occlusion-style baseline — compare the model’s predictions on the original input (“not very good”), the relevant part (“very good”), and the irrelevant part (“not”); the importance is the change in prediction]
Molecular partners underlie biological processes
Images by DALLE 2
Predicting via peak counts
Pipeline: take the filtered values at the local maxima → bin them into a histogram → predict using nearest-neighbor (see the sketch below).
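A sketch of the feature-extraction step, assuming the map has already been filtered; the neighborhood size and bin count are illustrative choices:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def peak_count_histogram(filtered_map, n_bins=20):
    # Peaks = pixels equal to the maximum of their 3x3 neighborhood.
    peaks = filtered_map[filtered_map == maximum_filter(filtered_map, size=3)]
    # The histogram of peak heights is the feature vector passed to a
    # nearest-neighbor predictor.
    hist, _ = np.histogram(peaks, bins=n_bins)
    return hist
```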
Ex: Pneumonia-death prediction
Caruana et al. 2015
Spurious correlation: asthma was correlated with lower risk in the training data (asthmatic patients received more aggressive care, masking their true risk)
[Overview: applications in science & medicine — cell biology, cosmological inference, neuroscience; interpreting neural nets — interaction summarization, explanation penalization, interpretable distillation; building interpretable models — rule-based algorithms, software]
Back to biology: interpreting many predictions
Back to biology: interpreting a DNN prediction
Transformation importance
Singh, Ha, Lanusse, Boehm, & Yu
ICLR Workshop ‘20
Evaluating (hierarchical) CD importance
Qualitative
Human experiments
Improving models
Recovering ground truth
...and more
Evaluating interpretation: penalizing scores improves performance

| | Unpenalized | Penalized |
| F1 | 0.67 | 0.73 |

[Figure: the spurious patch becomes less important after penalization]