1 of 34

LIME

Agnieszka Grala, Zuzanna Kurnicka

2 of 34

LIME - Local Interpretable Model-agnostic Explanations

  • Model agnosticism - LIME explains any supervised learning model by treating it as a ‘black box’.

  • Local explanations - the explanations are accurate in the neighbourhood of the explained data point.

3 of 34

Key idea

Locally approximate a black-box model by a simpler glass-box model, which is easier to interpret.

4 of 34

Intuition

Goal: understand the factors that influence a black-box model around a single instance of interest

  • Generate an artificial dataset to understand the local behavior of the complex model around the chosen point
  • Fit a simpler glass-box model to the artificial data - locally approximate the predictions of the black-box model
  • Examples of glass-box models: regularized linear models like LASSO regression or decision trees
  • Limit the complexity of the glass-box model so that it is easy to interpret.

5 of 34

6 of 34

Method

  • We want to find a model that locally approximates the black-box model f around the instance of interest x. To find the required approximation, we minimize the following loss:

ĝ(x) = argmin_{g ∈ G} L(f, g, π_x) + Ω(g)

  • G - class of simple, interpretable models
  • g - glass-box (surrogate) model
  • L(f, g, π_x) - loss measuring how poorly g approximates f in the neighborhood π_x
  • Ω(g) - penalty for the complexity of model g
  • π_x - neighborhood of x in which the approximation is sought

7 of 34

Data spaces

Note that models f and g may operate on different data spaces.

  • The black-box model

f : X → R

is defined on a large, p-dimensional space X corresponding to the p explanatory variables used in the model.

  • The glass-box model

g : X’ → R 

is defined on a q-dimensional space X’ with q<<p, often called the “space for interpretable representation”.

8 of 34

Algorithm
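
In outline: sample points around x in the interpretable space X’, obtain black-box predictions for them, weight the samples by their proximity to x, and fit a weighted, sparse glass-box model. Below is a minimal Python sketch of this loop, assuming binary interpretable features and two user-supplied helpers, predict_fn (the black-box model f) and to_original (the mapping from X’ back to X); both names are placeholders, not part of any library API.

    import numpy as np
    from sklearn.linear_model import Ridge

    def lime_explain(x_interp, predict_fn, to_original,
                     n_samples=1000, kernel_width=None, rng=None):
        """Sketch of the LIME loop for binary interpretable features."""
        rng = np.random.default_rng(rng)
        x_interp = np.asarray(x_interp)
        q = x_interp.size
        if kernel_width is None:
            kernel_width = 0.75 * np.sqrt(q)   # a common default, not part of the method

        # 1. Perturb: randomly switch interpretable features on/off.
        z = rng.integers(0, 2, size=(n_samples, q))
        z[0] = x_interp                        # keep the original point

        # 2. Query the black-box model f on the points mapped back to X.
        y = np.array([predict_fn(to_original(row)) for row in z])

        # 3. Weight the samples by proximity to x in X' (Gaussian kernel).
        d = np.linalg.norm(z - x_interp, axis=1)
        w = np.exp(-(d ** 2) / kernel_width ** 2)

        # 4. Fit a simple weighted model -- the glass box g (ridge here;
        #    K-LASSO, used later in the slides, is another option).
        g = Ridge(alpha=1.0)
        g.fit(z, y, sample_weight=w)
        return g.coef_                         # per-feature contributions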

9 of 34

Data types

  • Images: grid of pixels

  • Text: sequences of characters/words

  • Tabular: Data organized in a table format (rows and columns).

10 of 34

Data space transformation

Text data:

  • Original space (X): Raw text string.
  • Interpretable space ( X’ ): Words or n-grams.
  • Transformation (h): Tokenization.
  • Features in X’ : Typically binary (presence/absence of words/n-grams).

Image data:

  • Original space (X): Pixel array (high-dimensional).
  • Interpretable space ( X’): Superpixels.
  • Transformation (h): Image segmentation.
  • Features in X’ : Typically binary (presence/absence of superpixels).

Tabular data:

  • Original space (X): Mixed continuous and categorical features.
  • Interpretable space ( X’ ): Features derived from original data, made interpretable.

  • Challenge: No single best way to define X’
    • Continuous: Often discretized (e.g., by quantiles, CP profiles) to get categories, or remains continuous.
    • Categorical: High-cardinality categories may be combined.

This leads to various implementations and potentially different results; a common choice for continuous features is sketched below.
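
For instance, continuous features are often discretized by quartiles of the training data. A minimal sketch of such a quantile-based transformation (the bin labelling is illustrative):

    import numpy as np

    def quartile_bin(train_column, value):
        """Assign a continuous value to one of four quartile-based bins.

        train_column -- 1-D array of values of this feature in the training data
        value        -- the feature value of the instance of interest
        """
        q1, q2, q3 = np.quantile(train_column, [0.25, 0.50, 0.75])
        edges = [-np.inf, q1, q2, q3, np.inf]
        for lo, hi in zip(edges[:-1], edges[1:]):
            if lo < value <= hi:
                return f"({lo:.2f}, {hi:.2f}]"

    # Example (hypothetical): a fare of 7.25 might land in the lowest-fare bin.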

11 of 34

Image data space transformation example

  • Example: the black-box model (VGG16 network) uses images of size 224 x 224 pixels with 3 color channels (RGB)
  • Original space dimension: 224 x 224 x 3 (150,528-dimensional)
  • Transform into 100 superpixels: binary features that can be turned on or off

  • The black-box f operates on the space X = R^(150528)
  • The glass-box g operates on the space X’ = {0,1}^100 (the mapping is sketched below)
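
A sketch of this mapping using scikit-image for the segmentation; the segmentation parameters and the grey fill colour for switched-off superpixels are illustrative choices, not part of the original example.

    import numpy as np
    from skimage.segmentation import slic

    def to_interpretable(image, n_segments=100):
        """Segment an RGB image into roughly n_segments superpixels (X')."""
        return slic(image, n_segments=n_segments)   # superpixel label per pixel

    def to_original(image, segments, z):
        """Map a binary vector z from X' = {0,1}^q back to pixel space X.

        z has one entry per superpixel; switched-off superpixels (z == 0)
        are replaced by a neutral grey (an illustrative choice, uint8 image).
        """
        masked = image.copy()
        for i, label in enumerate(np.unique(segments)):
            if z[i] == 0:
                masked[segments == label] = 127
        return masked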

12 of 34

13 of 34

How do you get the variations of the data?

Perform perturbations on the selected instance:

  • Text and images:

choose a random number of variables and switch (negate) their binary values (e.g., remove words or turn superpixels off)

  • Tabular data: depends on X’ definition:
    • Continuous feature: Add random noise (drawn from a normal distribution) to the feature's value in X’.
    • Discretized continuous or categorical: Randomly change the category/bin of the feature in X’ to another possible value.

Challenge: Generating realistic perturbations can be difficult due to complex feature correlations.
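
A sketch of such perturbations for a tabular instance; the noise scale and the use of training-data frequencies for resampling categories are assumptions of this sketch rather than fixed parts of LIME.

    import numpy as np
    import pandas as pd

    def perturb_tabular(x, train_df, n_samples=1000, noise_scale=0.1, rng=None):
        """Generate perturbed copies of instance x (a pandas Series).

        Continuous columns: add Gaussian noise scaled by the column's std.
        Categorical columns: resample categories with training-data frequencies.
        """
        rng = np.random.default_rng(rng)
        samples = pd.DataFrame([x] * n_samples).reset_index(drop=True)
        for col in train_df.columns:
            if pd.api.types.is_numeric_dtype(train_df[col]):
                std = train_df[col].std()
                samples[col] = x[col] + rng.normal(0.0, noise_scale * std, n_samples)
            else:
                freqs = train_df[col].value_counts(normalize=True)
                samples[col] = rng.choice(freqs.index.to_numpy(), size=n_samples,
                                          p=freqs.to_numpy())
        return samples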

14 of 34

15 of 34

Weights and Local Neighborhood

Purpose of Weights: To make the interpretable model a local approximation. Prioritize samples closer to the instance being explained.

  • How Weights are Assigned: Based on the proximity (similarity/distance) of the perturbed sample to the original instance in the interpretable space (X’).
  • Kernel Function (e.g. Gaussian) converts the calculated distance into a proximity weight:

Weight=Kernel(Distance)

  • Kernel Width (σ) controls how quickly the weight decreases with distance:
    • Small σ: only very close points matter
    • Large σ: even far points influence the explanation
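
A minimal sketch of this weighting; the Gaussian form matches the kernel mentioned above, while the specific σ values are only for illustration.

    import numpy as np

    def kernel_weights(distances, kernel_width):
        """Convert distances to proximity weights with a Gaussian kernel."""
        return np.exp(-(distances ** 2) / kernel_width ** 2)

    # The same distances get very different weights for different sigma.
    d = np.array([0.0, 0.5, 1.0, 2.0])
    print(kernel_weights(d, kernel_width=0.5))   # far points get ~0 weight
    print(kernel_weights(d, kernel_width=2.0))   # far points still matter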

16 of 34

Fitting the glass-box model

  • The VGG16 network provides 1000 probabilities, one for each of the 1000 classes used for training the network.
  • Two most likely classes for the image: ‘standard poodle’ (p=0.18) and ‘goose’ (p=0.15).
  • The explanations were obtained with the K-LASSO method (K=15).
  • For each of the selected two classes, the K superpixels with non-zero coefficients are highlighted.

17 of 34

18 of 34

LIME for text data

  • Starting from the original text, new texts are created by randomly removing words from the original text.

  • The dataset is represented with binary features for each word. A feature is 1 if the corresponding word is included and 0 if it has been removed.

19 of 34

Example: Spam classification

  • Data: YouTube comments from five of the ten most viewed videos on YouTube in the first half of 2015. All 5 are music videos.
  • Two classes: spam (1), legit (0)
  • The black-box model is a deep decision tree trained on the document word matrix.

20 of 34

Creating variations of comments

  • Each row is a variation; 1 – word included, 0 – word removed.
  • “prob” - the predicted probability of spam
  • “weight” - the proximity of the variation to the original sentence (calculated as 1 - the proportion of words that were removed)
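
A sketch of how such variations and weights can be generated for a single comment (the example sentence is made up for illustration):

    import numpy as np

    def perturb_text(text, n_samples=10, rng=None):
        """Create variations of a text by randomly removing words.

        Returns a binary matrix (1 = word kept, 0 = word removed),
        the corresponding texts, and proximity weights computed as
        1 - the proportion of removed words.
        """
        rng = np.random.default_rng(rng)
        words = text.split()
        n_words = len(words)
        mask = rng.integers(0, 2, size=(n_samples, n_words))   # 1 = keep word
        texts = [" ".join(w for w, keep in zip(words, row) if keep)
                 for row in mask]
        weights = mask.sum(axis=1) / n_words     # = 1 - proportion removed
        return mask, texts, weights

    mask, texts, weights = perturb_text("Check out my channel for free gifts")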

21 of 34

Explanations

The word “channel” indicates a high probability of spam.

22 of 34

LIME for tabular data

23 of 34

Neighbourhood in tabular data

  • Core Issue: Hard to define a consistent "distance" in X’ when features are mixed types (continuous, categorical).
  • Distance is calculated differently for each feature type (e.g., numerical difference vs. categorical mismatch).
  • These contributions are on vastly different scales (e.g., large squared income difference vs. a 0/1 gender mismatch).
  • Combining them into a single total distance value requires subjective choices about scaling and weighting contributions.
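
To make the subjectivity concrete, here is one of many possible mixed-type distances; scaling numeric differences by the feature's standard deviation and scoring categorical mismatches as 0/1 are choices of this sketch, not something LIME prescribes.

    def mixed_distance(a, b, numeric_cols, categorical_cols, stds):
        """One of many possible distances between two mixed-type rows.

        a, b -- dict-like rows; stds -- per-feature standard deviations.
        Numeric features: absolute difference scaled by the feature's std.
        Categorical features: 0 if equal, 1 if different.
        """
        d = 0.0
        for col in numeric_cols:
            d += abs(a[col] - b[col]) / stds[col]   # scaling is a subjective choice
        for col in categorical_cols:
            d += 0.0 if a[col] == b[col] else 1.0
        return d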

24 of 34

Choosing kernel width

  • Problem: no universal way to find the best kernel width
  • Impact of Different Kernel Widths: Different σ values lead to different sets of weighted samples used for training.
  • Results in different local interpretable models.
  • Can significantly change the resulting explanation for the same instance.
  • In certain scenarios, you can easily turn your explanation around by changing the kernel width

25 of 34

26 of 34

Example: Titanic data

  • Data: passengers’ characteristics
  • 9 variables: gender, age, class, embarked, country, fare, sibsp, parch, and the target variable survived

27 of 34

Define interpretable space

  • Idea 1: gather similar variables into larger constructs corresponding to broader concepts (class and fare = “wealth”, age and gender = “demography”, …)

  • Idea 2: a binary vector with dichotomized variables (age: “<=15.36” and “>15.36”, class: “1st/2nd/deck crew” and “other”, …)
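
A sketch of Idea 2 for a single passenger; the age and class cut-offs are the ones quoted above, while the gender indicator and the exact category labels are illustrative.

    def to_interpretable(passenger):
        """Dichotomize selected Titanic variables into a binary vector (X')."""
        return {
            "age <= 15.36": int(passenger["age"] <= 15.36),
            "class in {1st, 2nd, deck crew}": int(
                passenger["class"] in {"1st", "2nd", "deck crew"}),
            "gender = male": int(passenger["gender"] == "male"),
        }

    # Example for a hypothetical young passenger travelling 1st class:
    print(to_interpretable({"age": 8, "class": "1st", "gender": "male"}))
    # {'age <= 15.36': 1, 'class in {1st, 2nd, deck crew}': 1, 'gender = male': 1}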

28 of 34

Steps

Black-box model: a random forest

  • Transform data for Johnny D. (instance of interest)
  • Generate artificial data and set weights
  • Use K-LASSO with K=3 to identify the three most influential binary variables (see the sketch below)
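
A sketch of the K-LASSO step (here with K = 3), assuming the perturbed data Z, the black-box predictions y, and the proximity weights are already available from the previous steps; folding the weights into the data before running LARS is a simplification of what full implementations do.

    import numpy as np
    from sklearn.linear_model import lars_path, LinearRegression

    def k_lasso(Z, y, weights, K=3):
        """Select K features with the LASSO path, then refit by weighted OLS.

        Z       -- perturbed samples in the interpretable space X' (0/1 matrix)
        y       -- black-box predictions for those samples
        weights -- proximity weights of the samples
        """
        # Fold the sample weights into the data so plain LARS can be used.
        sw = np.sqrt(weights)[:, np.newaxis]
        Zw, yw = Z * sw, y * np.sqrt(weights)

        # Walk the LASSO regularization path until >= K coefficients are non-zero.
        _, _, coefs = lars_path(Zw, yw, method="lasso")
        for j in range(coefs.shape[1]):
            selected = np.flatnonzero(coefs[:, j])
            if len(selected) >= K:
                break

        # Refit a weighted linear model on the selected features only.
        g = LinearRegression()
        g.fit(Z[:, selected], y, sample_weight=weights)
        return selected, g.coef_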

29 of 34

30 of 34

Strengths

  • Model-agnostic - makes no assumptions about the structure of the black-box model
  • Interpretable representation - transforms complex data into a simpler, understandable format (e.g., superpixels, words).
  • Local fidelity - the explanations are locally well-fitted to the black-box model.
  • The method has been widely adopted in text and image analysis, partly due to its interpretable data representation.

31 of 34

Limitations

  • Tabular Data Ambiguity: No single best way to handle continuous/categorical features, leading to varying explanations across implementations.
  • Approximates Model, Not Data: No control over the quality of the local fit of the glass-box model to the data (may be misleading).
  • High-Dimensional Challenges: Data sparsity makes defining the local neighbourhood difficult and can lead to explanation instability.

32 of 34

Summary

The most useful applications of LIME are limited to high-dimensional data for which one can define a low-dimensional interpretable data representation, as in image analysis, text analysis, or genomics.

33 of 34

References

  • https://www.sciencedirect.com/science/article/pii/S1566253523001148
  • https://ema.drwhy.ai/
  • https://christophm.github.io/interpretable-ml-book/
  • https://www.geeksforgeeks.org/introduction-to-explainable-aixai-using-lime/

34 of 34

Thank you
