1 of 8

Neuron Shapley:

Discovering Responsible Neurons

Amirata Ghorbani and James Zou

NeurIPS 2020

CSG@UEF Reading Club

Xuechen Liu

2022.03.18

2 of 8

Background: Interpretability of DNNs

  • The term “interpretability” needs to be defined first
    • Truthfulness, fairness, selected partitions, ...
  • As with a decision tree in ASR, we answer a series of questions to move down different branches, as in the right-hand figure.
  • Approximation is “kind of” an approach toward better interpretability and expressiveness of the model
  • Here, however, the focus is sensitivity analysis: analyzing the contributions of individual components in a DNN

3 of 8

Shapley Value

  • Derived from game theory
  • Measures each player’s contribution to a group outcome so that the payout can be divided fairly
  • Has several useful properties for explaining intermediate attributes
    • Efficiency
    • Symmetry
    • Dummy
    • Additivity
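  • Of course, it has problems:
    • Computationally expensive: exact computation evaluates every coalition, so the cost grows exponentially with the number of players (NP-hard)
    • Does not “create a new model”, so no hypothesis test
    • Doesn’t work well when features are correlated (maybe not a big deal?)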
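To make the properties and the cost concrete, here is a minimal sketch of exact Shapley computation by brute-force enumeration. The function name `exact_shapley` and the three-player `toy_value` game are hypothetical illustrations, not something taken from the paper.

```python
from itertools import combinations
from math import comb

def exact_shapley(players, value):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            for subset in combinations(others, k):
                s = frozenset(subset)
                marginal = value(s | {p}) - value(s)
                # Average over all coalitions of the same size k.
                total += marginal / comb(n - 1, k)
        phi[p] = total / n  # average over coalition sizes
    return phi

# Hypothetical toy game: the payout of 12 needs both A and B; C is a dummy.
def toy_value(coalition):
    return 12.0 if {"A", "B"} <= coalition else 0.0

players = ["A", "B", "C"]
phi = exact_shapley(players, toy_value)
print(phi)  # {'A': 6.0, 'B': 6.0, 'C': 0.0}
```

In this toy game, A and B are interchangeable and get equal shares (symmetry), C never changes the outcome and gets 0 (dummy), and the shares sum to the full payout of 12 (efficiency). The nested loops visit every subset of the other players, which is why exact computation stops being feasible beyond a few dozen players.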

Lloyd Shapley, 2012, photo credit: NY Times

4 of 8

Neuron Shapley

  • Goal: Find the contribution of each neuron/filter to the behavior of the network
  • The properties of Shapley values carry over to neural network filters
    • Zero contribution: a neuron whose removal has no effect on performance is assigned 0
    • Symmetric elements: two neurons are assigned equal contributions if they are exchangeable under every possible setting
    • Additivity in performance metric: when the overall performance is a sum of values computed on different test points, a neuron’s contribution is the sum of its contributions on those points
  • A neuron’s contribution is then defined as:
    • “The marginal contribution to the performance of every subnetwork S of the original model (normalized by the number of subnetworks with the same cardinality |S|)”
    • The equation is essentially the same as the original Shapley definition
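For reference, the quoted definition corresponds to the standard Shapley formula, written out below. The symbols are assumptions chosen for this note: N is the set of all neurons/filters, V(S) is the performance of the subnetwork that keeps only the elements in S, and φ_i is the value assigned to neuron i.

```latex
\phi_i \;=\; \frac{1}{|N|} \sum_{S \subseteq N \setminus \{i\}}
\frac{V(S \cup \{i\}) - V(S)}{\binom{|N|-1}{|S|}}
```

The numerator is the marginal contribution of neuron i to subnetwork S, and the binomial coefficient is the number of subnetworks with the same cardinality |S|, which matches the normalization in the quote.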

5 of 8

Optimizations

  • Computing Shapley values is very expensive
  • Instead of finding everybody’s contribution, the paper considers only the top contributors
  • Early truncation: stop computing marginal contributions once they become negligible, so effort goes only to the most relevant elements
  • Two sampling methods
    • Monte Carlo estimation: estimate the expectation from sampled permutations instead of enumerating every subnetwork one by one
    • Adaptive sampling via multi-armed bandit (MAB): keep sampling only the neurons whose confidence bounds do not yet settle whether they are top contributors
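The Monte Carlo and truncation ideas can be sketched on a toy game as follows. This is an illustrative sketch under assumptions, not the paper's implementation: the names `truncated_mc_shapley`, `truncation_threshold` and `toy_value` are hypothetical, and the adaptive (bandit) stopping rule is left out.

```python
import random

def truncated_mc_shapley(players, value, num_permutations=2000,
                         truncation_threshold=0.0, seed=0):
    """Estimate Shapley values from random permutations with early truncation."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        active = set(players)              # start from the full "network"
        current = value(frozenset(active))
        for p in order:
            # Truncation (assumption): once performance has collapsed, treat
            # the remaining marginal contributions in this permutation as 0.
            if current <= truncation_threshold:
                break
            active.remove(p)
            reduced = value(frozenset(active))
            totals[p] += current - reduced  # marginal contribution of p
            current = reduced
    return {p: totals[p] / num_permutations for p in players}

# Hypothetical toy game: performance is 12 only while both A and B are present.
def toy_value(coalition):
    return 12.0 if {"A", "B"} <= coalition else 0.0

estimates = truncated_mc_shapley(["A", "B", "C"], toy_value)
print(estimates)
```

In this toy game the estimates for A and B should land near 6 and the estimate for C at 0. Each permutation costs at most one evaluation per player, and truncation skips the evaluations that come after performance has already dropped to the threshold.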

6 of 8

Experiments

  • We consider two networks
  • Inception-V3, trained on ImageNet
    • 78.1% reported accuracy
    • Shapley value computation is performed on the 17216 filters before the logit layer
  • SqueezeNet, trained on CelebA
    • 98.0% reported accuracy
    • Number of filters: 2976 (missed details)
  • Confidence bounds for adaptive sampling are estimated via Bernstein methods
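The slide does not say which Bernstein bound is used. As an assumption, the sketch below uses one common empirical Bernstein form (Maurer and Pontil), which gives a one-sided bound at level `delta` for samples confined to an interval of width `value_range`. The function name and parameters are hypothetical.

```python
import math

def empirical_bernstein_radius(samples, value_range, delta=0.05):
    """One-sided empirical Bernstein radius around the sample mean.

    Assumes i.i.d. samples lying in an interval of width `value_range`.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)  # unbiased
    log_term = math.log(2.0 / delta)
    return (math.sqrt(2.0 * variance * log_term / n)
            + 7.0 * value_range * log_term / (3.0 * (n - 1)))

# Hypothetical marginal contributions observed for one neuron.
few = [0.1, 0.3, 0.2, 0.4]
many = few * 50  # same spread, 50 times more samples
radius_few = empirical_bernstein_radius(few, value_range=1.0)
radius_many = empirical_bernstein_radius(many, value_range=1.0)
print(radius_few, radius_many)
```

The radius shrinks as more samples arrive, and shrinks faster for low-variance neurons. That is what would let an adaptive sampler stop early on neurons whose bound already separates them from the top contributors.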

7 of 8

Experiments

  • TMAB-Shapley is much faster, which confirms the efficiency of the proposed MAB sampling
  • Only a small number of filters/neurons appear to be critical for performance
  • A number of filters are also vulnerable to adversarial samples; removing them recovers a fair level of accuracy (but the result is not usable)
  • Shapley values can also be used to detect the “culprit” unfair filters, improving gender detection accuracy

8 of 8

My Takeaways

  • Neuron Shapley is a flexible framework that can be applied to any type of model.
  • Further characterization methods such as SHAP can also be applied to neural networks, and some work on this already exists
  • But it does not answer every question that Shapley values raise. I still have a question about exactly how it deals with feature correlation, for example.
  • The paper also offers an interesting perspective on interpretability: interpretability research should ultimately benefit model fine-tuning and repair