1 of 34

LIME

Agnieszka Grala, Zuzanna Kurnicka

2 of 34

LIME - Local Interpretable Model-agnostic Explanations

  • Model agnosticism - LIME explains any supervised learning model by treating it as a ‘black box’.

  • Local explanations - the explanations are accurate in the neighbourhood of the explained data point.

3 of 34

Key idea

Locally approximate a black-box model by a simpler glass-box model, which is easier to interpret.

4 of 34

Intuition

Goal: understand the factors that influence a black-box model around a single instance of interest

  • Generate an artificial dataset to understand the local behavior of the complex model around the chosen point
  • Fit a simpler glass-box model to the artificial data - locally approximate the predictions of the black-box model
  • Examples of glass-box models: regularized linear models like LASSO regression or decision trees
  • Limit the complexity of the glass-box model so that it is easy to interpret.

5 of 34

6 of 34

Method

  • We want to find a model that locally approximates the black-box model f around the instance of interest x. To find the required approximation, we minimize the following loss:

ĝ(x) = argmin_{g ∈ G} L(f, g, π_x) + Ω(g)

  • G - class of simple, interpretable models
  • g - glass-box (surrogate) model
  • L(f, g, π_x) - loss measuring how poorly g approximates f in the neighborhood π_x
  • Ω(g) - penalty for the complexity of model g
  • π_x - neighborhood of x in which the approximation is sought

7 of 34

Data spaces

Note that models f and g may operate on different data spaces.

  • The black-box model

f : X → R

is defined on a large, p-dimensional space X corresponding to the p explanatory variables used in the model.

  • The glass-box model

g : X’ → R 

is defined on a q-dimensional space X’ with q<<p, often called the “space for interpretable representation”.

8 of 34

Algorithm
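
In outline: sample points around x in the interpretable space X’, obtain black-box predictions for them, weight the samples by their proximity to x, and fit a weighted, sparse glass-box model. Below is a minimal Python sketch of this loop, assuming binary interpretable features and two user-supplied helpers, predict_fn (the black-box model f) and to_original (the mapping from X’ back to X); both names are placeholders, not part of any library API.

    import numpy as np
    from sklearn.linear_model import Ridge

    def lime_explain(x_interp, predict_fn, to_original,
                     n_samples=1000, kernel_width=None, rng=None):
        """Sketch of the LIME loop for binary interpretable features."""
        rng = np.random.default_rng(rng)
        x_interp = np.asarray(x_interp)
        q = x_interp.size
        if kernel_width is None:
            kernel_width = 0.75 * np.sqrt(q)   # a common default, not part of the method

        # 1. Perturb: randomly switch interpretable features on/off.
        z = rng.integers(0, 2, size=(n_samples, q))
        z[0] = x_interp                        # keep the original point

        # 2. Query the black-box model f on the points mapped back to X.
        y = np.array([predict_fn(to_original(row)) for row in z])

        # 3. Weight the samples by proximity to x in X' (Gaussian kernel).
        d = np.linalg.norm(z - x_interp, axis=1)
        w = np.exp(-(d ** 2) / kernel_width ** 2)

        # 4. Fit a simple weighted model -- the glass box g (ridge here;
        #    K-LASSO, used later in the slides, is another option).
        g = Ridge(alpha=1.0)
        g.fit(z, y, sample_weight=w)
        return g.coef_                         # per-feature contributions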

9 of 34

Data types

  • Images: grid of pixels

  • Text: sequences of characters/words

  • Tabular: Data organized in a table format (rows and columns).

10 of 34

Data space transformation

Text data:

  • Original space (X): Raw text string.
  • Interpretable space ( X’ ): Words or n-grams.
  • Transformation (h): Tokenization.
  • Features in X’ : Typically binary (presence/absence of words/n-grams).

Image data:

  • Original space (X): Pixel array (high-dimensional).
  • Interpretable space ( X’): Superpixels.
  • Transformation (h): Image segmentation.
  • Features in X’ : Typically binary (presence/absence of superpixels).

Tabular data:

  • Original space (X): Mixed continuous and categorical features.
  • Interpretable space ( X’ ): Features derived from original data, made interpretable.

  • Challenge: No single best way to define X’
    • Continuous: Often discretized (e.g., by quantiles, CP profiles) to get categories, or remains continuous.
    • Categorical: High-cardinality categories may be combined.

This leads to various implementations and potentially different results; a common choice for continuous features is sketched below.
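
For instance, continuous features are often discretized by quartiles of the training data. A minimal sketch of such a quantile-based transformation (the bin labelling is illustrative):

    import numpy as np

    def quartile_bin(train_column, value):
        """Assign a continuous value to one of four quartile-based bins.

        train_column -- 1-D array of values of this feature in the training data
        value        -- the feature value of the instance of interest
        """
        q1, q2, q3 = np.quantile(train_column, [0.25, 0.50, 0.75])
        edges = [-np.inf, q1, q2, q3, np.inf]
        for lo, hi in zip(edges[:-1], edges[1:]):
            if lo < value <= hi:
                return f"({lo:.2f}, {hi:.2f}]"

    # Example (hypothetical): a fare of 7.25 might land in the lowest-fare bin.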

11 of 34

Image data space transformation example

  • Example: the black-box model (VGG16 network) uses images of size 224 x 224 pixels with 3 color channels (RGB)
  • Original space dimension: 224 x 224 x 3 (150,528-dimensional)
  • Transform into 100 superpixels: binary features that can be turned on or off

  • The black-box f operates on the space X = R^(150528)
  • The glass-box g operates on the space X’ = {0,1}^100 (the mapping is sketched below)
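
A sketch of this mapping using scikit-image for the segmentation; the segmentation parameters and the grey fill colour for switched-off superpixels are illustrative choices, not part of the original example.

    import numpy as np
    from skimage.segmentation import slic

    def to_interpretable(image, n_segments=100):
        """Segment an RGB image into roughly n_segments superpixels (X')."""
        return slic(image, n_segments=n_segments)   # superpixel label per pixel

    def to_original(image, segments, z):
        """Map a binary vector z from X' = {0,1}^q back to pixel space X.

        z has one entry per superpixel; switched-off superpixels (z == 0)
        are replaced by a neutral grey (an illustrative choice, uint8 image).
        """
        masked = image.copy()
        for i, label in enumerate(np.unique(segments)):
            if z[i] == 0:
                masked[segments == label] = 127
        return masked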

12 of 34

13 of 34

How do you get the variations of the data?

Perform perturbations on the selected instance:

  • Text and images:

choose a random number of variables and switch (negate) their binary values (e.g., remove words or turn superpixels off)

  • Tabular data: depends on X’ definition:
    • Continuous feature: Add random noise (drawn from a normal distribution) to the feature's value in X’.
    • Discretized continuous or categorical: Randomly change the category/bin of the feature in X’ to another possible value.

Challenge: Generating realistic perturbations can be difficult due to complex feature correlations.
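
A sketch of such perturbations for a tabular instance; the noise scale and the use of training-data frequencies for resampling categories are assumptions of this sketch rather than fixed parts of LIME.

    import numpy as np
    import pandas as pd

    def perturb_tabular(x, train_df, n_samples=1000, noise_scale=0.1, rng=None):
        """Generate perturbed copies of instance x (a pandas Series).

        Continuous columns: add Gaussian noise scaled by the column's std.
        Categorical columns: resample categories with training-data frequencies.
        """
        rng = np.random.default_rng(rng)
        samples = pd.DataFrame([x] * n_samples).reset_index(drop=True)
        for col in train_df.columns:
            if pd.api.types.is_numeric_dtype(train_df[col]):
                std = train_df[col].std()
                samples[col] = x[col] + rng.normal(0.0, noise_scale * std, n_samples)
            else:
                freqs = train_df[col].value_counts(normalize=True)
                samples[col] = rng.choice(freqs.index.to_numpy(), size=n_samples,
                                          p=freqs.to_numpy())
        return samples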

14 of 34

15 of 34

Weights and Local Neighborhood

Purpose of Weights: To make the interpretable model a local approximation. Prioritize samples closer to the instance being explained.

  • How Weights are Assigned: Based on the proximity (similarity/distance) of the perturbed sample to the original instance in the interpretable space (X’).
  • Kernel Function (e.g. Gaussian) converts the calculated distance into a proximity weight:

Weight=Kernel(Distance)

  • Kernel Width (σ) controls how quickly the weight decreases with distance:
    • Small σ: only very close points matter
    • Large σ: even far points influence the explanation
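
A minimal sketch of this weighting; the Gaussian form matches the kernel mentioned above, while the specific σ values are only for illustration.

    import numpy as np

    def kernel_weights(distances, kernel_width):
        """Convert distances to proximity weights with a Gaussian kernel."""
        return np.exp(-(distances ** 2) / kernel_width ** 2)

    # The same distances get very different weights for different sigma.
    d = np.array([0.0, 0.5, 1.0, 2.0])
    print(kernel_weights(d, kernel_width=0.5))   # far points get ~0 weight
    print(kernel_weights(d, kernel_width=2.0))   # far points still matter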

16 of 34

Fitting the glass-box model

  • The VGG16 network provides 1000 probabilities, one for each of the 1000 classes used for training the network.
  • Two most likely classes for the image: ‘standard poodle’ (p=0.18) and ‘goose’ (p=0.15).
  • The explanations were obtained with the K-LASSO method (K=15).
  • For each of the selected two classes, the K superpixels with non-zero coefficients are highlighted.

17 of 34

18 of 34

LIME for text data

  • Starting from the original text, new texts are created by randomly removing words from the original text.

  • The dataset is represented with binary features for each word. A feature is 1 if the corresponding word is included and 0 if it has been removed.

19 of 34

Example: Spam classification

  • Data: YouTube comments from five of the ten most viewed videos on YouTube in the first half of 2015. All 5 are music videos.
  • Two classes: spam (1), legit (0)
  • The black-box model is a deep decision tree trained on the document word matrix.

20 of 34

Creating variations of comments

  • Each row is a variation; 1 – word included, 0 – word removed.
  • “prob” - the predicted probability of spam
  • “weight” - the proximity of the variation to the original sentence (calculated as 1 - the proportion of words that were removed)
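
A sketch of how such variations and weights can be generated for a single comment (the example sentence is made up for illustration):

    import numpy as np

    def perturb_text(text, n_samples=10, rng=None):
        """Create variations of a text by randomly removing words.

        Returns a binary matrix (1 = word kept, 0 = word removed),
        the corresponding texts, and proximity weights computed as
        1 - the proportion of removed words.
        """
        rng = np.random.default_rng(rng)
        words = text.split()
        n_words = len(words)
        mask = rng.integers(0, 2, size=(n_samples, n_words))   # 1 = keep word
        texts = [" ".join(w for w, keep in zip(words, row) if keep)
                 for row in mask]
        weights = mask.sum(axis=1) / n_words     # = 1 - proportion removed
        return mask, texts, weights

    mask, texts, weights = perturb_text("Check out my channel for free gifts")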

21 of 34

Explanations

The word “channel” indicates a high probability of spam.

22 of 34

LIME for tabular data

23 of 34

Neighbourhood in tabular data

  • Core Issue: Hard to define a consistent "distance" in X’ when features are mixed types (continuous, categorical).
  • Distance is calculated differently for each feature type (e.g., numerical difference vs. categorical mismatch).
  • These contributions are on vastly different scales (e.g., large squared income difference vs. a 0/1 gender mismatch).
  • Combining them into a single total distance value requires subjective choices about scaling and weighting contributions.
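
To make the subjectivity concrete, here is one of many possible mixed-type distances; scaling numeric differences by the feature's standard deviation and scoring categorical mismatches as 0/1 are choices of this sketch, not something LIME prescribes.

    def mixed_distance(a, b, numeric_cols, categorical_cols, stds):
        """One of many possible distances between two mixed-type rows.

        a, b -- dict-like rows; stds -- per-feature standard deviations.
        Numeric features: absolute difference scaled by the feature's std.
        Categorical features: 0 if equal, 1 if different.
        """
        d = 0.0
        for col in numeric_cols:
            d += abs(a[col] - b[col]) / stds[col]   # scaling is a subjective choice
        for col in categorical_cols:
            d += 0.0 if a[col] == b[col] else 1.0
        return d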

24 of 34

Choosing kernel width

  • Problem: no universal way to find the best kernel width
  • Impact of Different Kernel Widths: Different σ values lead to different sets of weighted samples used for training.
  • Results in different local interpretable models.
  • Can significantly change the resulting explanation for the same instance.
  • In certain scenarios, you can easily turn your explanation around by changing the kernel width

25 of 34

26 of 34

Example: Titanic data

  • Data: passengers’ characteristics
  • 9 variables: gender, age, class, embarked, country, fare, sibsp, parch, and the target variable survived

27 of 34

Define interpretable space

  • Idea 1: gather similar variables into larger constructs corresponding to broader concepts (class and fare = “wealth”, age and gender = “demography”, …)

  • Idea 2: a binary vector with dichotomized variables (age: “<=15.36” and “>15.36”, class: “1st/2nd/deck crew” and “other”, …)
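
A sketch of Idea 2 for a single passenger; the age and class cut-offs are the ones quoted above, while the gender indicator and the exact category labels are illustrative.

    def to_interpretable(passenger):
        """Dichotomize selected Titanic variables into a binary vector (X')."""
        return {
            "age <= 15.36": int(passenger["age"] <= 15.36),
            "class in {1st, 2nd, deck crew}": int(
                passenger["class"] in {"1st", "2nd", "deck crew"}),
            "gender = male": int(passenger["gender"] == "male"),
        }

    # Example for a hypothetical young passenger travelling 1st class:
    print(to_interpretable({"age": 8, "class": "1st", "gender": "male"}))
    # {'age <= 15.36': 1, 'class in {1st, 2nd, deck crew}': 1, 'gender = male': 1}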

28 of 34

Steps

Black-box model: a random forest

  • Transform data for Johnny D. (instance of interest)
  • Generate artificial data and set weights
  • Use K-LASSO with K=3 to identify the three most influential binary variables (see the sketch below)
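
A sketch of the K-LASSO step (here with K = 3), assuming the perturbed data Z, the black-box predictions y, and the proximity weights are already available from the previous steps; folding the weights into the data before running LARS is a simplification of what full implementations do.

    import numpy as np
    from sklearn.linear_model import lars_path, LinearRegression

    def k_lasso(Z, y, weights, K=3):
        """Select K features with the LASSO path, then refit by weighted OLS.

        Z       -- perturbed samples in the interpretable space X' (0/1 matrix)
        y       -- black-box predictions for those samples
        weights -- proximity weights of the samples
        """
        # Fold the sample weights into the data so plain LARS can be used.
        sw = np.sqrt(weights)[:, np.newaxis]
        Zw, yw = Z * sw, y * np.sqrt(weights)

        # Walk the LASSO regularization path until >= K coefficients are non-zero.
        _, _, coefs = lars_path(Zw, yw, method="lasso")
        for j in range(coefs.shape[1]):
            selected = np.flatnonzero(coefs[:, j])
            if len(selected) >= K:
                break

        # Refit a weighted linear model on the selected features only.
        g = LinearRegression()
        g.fit(Z[:, selected], y, sample_weight=weights)
        return selected, g.coef_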

29 of 34

30 of 34

Strengths

  • Model-agnostic - makes no assumptions about the structure of the black-box model
  • Interpretable representation - transforms complex data into a simpler, understandable format (e.g., superpixels, words).
  • Local fidelity - the explanations are locally well-fitted to the black-box model.
  • The method has been widely adopted in text and image analysis, partly due to its interpretable data representation.

31 of 34

Limitations

  • Tabular Data Ambiguity: No single best way to handle continuous/categorical features, leading to varying explanations across implementations.
  • Approximates Model, Not Data: No control over the quality of the local fit of the glass-box model to the data (may be misleading).
  • High-Dimensional Challenges: Data sparsity makes defining the local neighbourhood difficult and can lead to explanation instability.

32 of 34

Summary

The most useful applications of LIME are limited to high-dimensional data for which one can define a low-dimensional interpretable data representation, as in image analysis, text analysis, or genomics.

33 of 34

References

  • https://www.sciencedirect.com/science/article/pii/S1566253523001148
  • https://ema.drwhy.ai/
  • https://christophm.github.io/interpretable-ml-book/
  • https://www.geeksforgeeks.org/introduction-to-explainable-aixai-using-lime/

34 of 34

Thank you
