Understanding Black-box Predictions via Influence Functions
Alex Adam, Keiran Paster, Jenny (Jingyi) Liu
4/1/2021
CSC2541
Paper by: Pang Wei Koh, Percy Liang
Outline
Introduction to Influence Functions
Influence of a training input
Training set: $z_1, \dots, z_n$, where $z_i = (x_i, y_i)$
Loss on single point: $L(z, \theta)$, full loss: $\frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta)$
Empirical risk minimizer: $\hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta)$
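This setup can be sketched concretely. The snippet below is a minimal toy instance, assuming L2-regularized logistic regression on synthetic data fit by plain gradient descent (the data, regularizer, and optimizer are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.normal(size=(n, d))                                     # inputs x_i
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))  # labels y_i in {-1, +1}

def loss(theta, x, yi):
    """L(z, theta) for a single point z = (x, y), with L2 regularization."""
    return np.log1p(np.exp(-yi * (x @ theta))) + 0.5 * lam * theta @ theta

def full_loss(theta):
    """(1/n) * sum_i L(z_i, theta)."""
    return np.mean([loss(theta, X[i], y[i]) for i in range(n)])

def full_grad(theta):
    s = 1.0 / (1.0 + np.exp(y * (X @ theta)))  # sigmoid(-y_i * x_i . theta)
    return -(X * (y * s)[:, None]).mean(axis=0) + lam * theta

# Empirical risk minimizer theta_hat = argmin_theta (1/n) sum_i L(z_i, theta),
# found here by plain gradient descent.
theta_hat = np.zeros(d)
for _ in range(2000):
    theta_hat -= 0.5 * full_grad(theta_hat)

print(full_loss(theta_hat))
```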
How do our model's predictions change if we remove a training example $z$?
Influence of a training input
Key idea: more generally, upweighting a point $z$ by $\epsilon$ yields:
$\hat{\theta}_{\epsilon, z} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta)$
Effect on the parameters:
$\mathcal{I}_{\text{up,params}}(z) = \left.\frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon}\right|_{\epsilon = 0} = -H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta})$, where $H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta})$
Effect on the loss for a single test example:
$\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta})$
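On a model small enough to invert the Hessian directly, these influence quantities can be computed exactly. A minimal sketch, assuming the same toy regularized logistic regression on synthetic data (one training point is reused as the "test" point purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 100, 4, 0.1
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Fit theta_hat by gradient descent on the regularized empirical risk.
theta = np.zeros(d)
for _ in range(3000):
    s = sigmoid(-y * (X @ theta))
    theta -= 0.5 * (-(X * (y * s)[:, None]).mean(axis=0) + lam * theta)

def grad_single(x, yi):
    """grad_theta L(z, theta_hat) for a single point z = (x, y)."""
    return -yi * x * sigmoid(-yi * (x @ theta)) + lam * theta

# Hessian of the full loss at theta_hat: H = (1/n) sum_i hess L(z_i, theta_hat).
p = sigmoid(X @ theta)
H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
H_inv = np.linalg.inv(H)

x_test, y_test = X[0], y[0]          # reuse a training point as the "test" point
g_test = grad_single(x_test, y_test)

# I_up,loss(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i) for every i.
influences = np.array([-g_test @ H_inv @ grad_single(X[i], y[i]) for i in range(n)])
print(influences[:3])
```

Note the sign convention: a point's influence on its own loss is negative, since removing it (an $\epsilon = -\frac{1}{n}$ change) increases the loss on that point.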
Perturbing a training input
Want to find: the effect on the test loss of perturbing a training input $z = (x, y)$ to $z_{\delta} = (x + \delta, y)$:
$\mathcal{I}_{\text{pert,loss}}(z, z_{\text{test}})^{\top} = -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{x} \nabla_{\theta} L(z, \hat{\theta})$
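A minimal sketch of this perturbation influence on the same toy logistic regression; for clarity, the mixed derivative $\nabla_x \nabla_\theta L$ is taken by finite differences rather than derived by hand (data and model are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 100, 4, 0.1
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Fit theta_hat by gradient descent on the regularized empirical risk.
theta = np.zeros(d)
for _ in range(3000):
    s = sigmoid(-y * (X @ theta))
    theta -= 0.5 * (-(X * (y * s)[:, None]).mean(axis=0) + lam * theta)

def grad_theta(x, yi):
    """grad_theta L(z, theta_hat) for one point z = (x, y)."""
    return -yi * x * sigmoid(-yi * (x @ theta)) + lam * theta

p = sigmoid(X @ theta)
H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)

def mixed_grad(x, yi, eps=1e-5):
    """grad_x grad_theta L(z, theta_hat); column k is d(grad_theta L)/dx_k,
    computed by central finite differences."""
    cols = []
    for k in range(d):
        e = np.zeros(d)
        e[k] = eps
        cols.append((grad_theta(x + e, yi) - grad_theta(x - e, yi)) / (2 * eps))
    return np.stack(cols, axis=1)

z = (X[0], y[0])
z_test = (X[1], y[1])
# Direction in input space whose perturbation most changes the loss on z_test:
I_pert = -grad_theta(*z_test) @ np.linalg.solve(H, mixed_grad(*z))
print(I_pert)
```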
Efficiently calculating influence
Precompute for each test example: $s_{\text{test}} = H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z_{\text{test}}, \hat{\theta})$, so that $\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -s_{\text{test}} \cdot \nabla_{\theta} L(z, \hat{\theta})$
Naive computation: $O(np^2 + p^3)$ to form and invert $H_{\hat{\theta}}$ ($n$ training points, $p$ parameters); instead, approximate $s_{\text{test}}$ with implicit Hessian-vector products (conjugate gradients or stochastic estimation), costing $O(np)$
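The conjugate-gradient route can be sketched as follows: solve $H s_{\text{test}} = \nabla_\theta L(z_{\text{test}}, \hat{\theta})$ using only Hessian-vector products, never forming or inverting $H$. The toy quadratic Hessian and random test gradient below are illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 50
A = rng.normal(size=(n, p)) / np.sqrt(n)
lam = 0.1

def hvp(v):
    """Hessian-vector product H v = (A^T A + lam I) v in O(n p), without forming H."""
    return A.T @ (A @ v) + lam * v

def conjugate_gradient(hvp, b, tol=1e-10, max_iter=500):
    """Solve H s = b given only the map v -> H v."""
    s = np.zeros_like(b)
    r = b - hvp(s)
    d = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hd = hvp(d)
        alpha = rs / (d @ Hd)
        s += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return s

g_test = rng.normal(size=p)   # stands in for grad_theta L(z_test, theta_hat)
s_test = conjugate_gradient(hvp, g_test)

# With s_test precomputed, each training point's influence is one dot product:
#   I_up,loss(z, z_test) = -s_test . grad_theta L(z, theta_hat)
print(np.linalg.norm(hvp(s_test) - g_test))
```

For neural networks, the same idea applies with the hvp computed by automatic differentiation; the paper also uses a stochastic estimator (LiSSA) in place of conjugate gradients.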
Influence functions vs leave-one-out retraining
Logistic regression on MNIST
Non-convergent, non-convex setting
Influence functions vs leave-one-out retraining
Smooth approximations to hinge loss
Use Cases of Influence Functions
Understanding Model Behavior
Adversarial Training Examples
Fixing Mislabeled Examples
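The paper's recipe here is to rank training points by their self-influence $\mathcal{I}_{\text{up,loss}}(z_i, z_i)$ and hand-check the top candidates, since mislabeled points tend to have large-magnitude self-influence. A sketch on toy synthetic data with a few deliberately flipped labels (data, model, and the specific flipped indices are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d))
flipped = [3, 57, 121]                     # deliberately mislabel these points
y[flipped] *= -1

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Fit theta_hat on the corrupted training set.
theta = np.zeros(d)
for _ in range(3000):
    s = sigmoid(-y * (X @ theta))
    theta -= 0.5 * (-(X * (y * s)[:, None]).mean(axis=0) + lam * theta)

p = sigmoid(X @ theta)
H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
H_inv = np.linalg.inv(H)

# Per-example loss gradients (rows) and self-influence -g_i^T H^{-1} g_i.
grads = -(X * (y * sigmoid(-y * (X @ theta)))[:, None])
self_influence = -np.einsum('ij,jk,ik->i', grads, H_inv, grads)

# Most negative self-influence first: these are the points to hand-check.
suspects = np.argsort(self_influence)[:5]
print(sorted(suspects.tolist()))
```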