Context Sight: Model Understanding and Debugging via Interpretable Context
Jun Yuan
New York University
Enrico Bertini
Northeastern University
HILDA Paper Mentor: Minsuk Kahng
Oregon State University
Why did the model make this guess?
[Illustration: the user sketches an object and asks “What did I draw?” The AI model returns guesses with confidence scores: Pear (90%), Onion (88%), Potato (80%), Avocado (70%), and shows the closest matching training example for each guess (Pear, Onion, Potato).]
*Example inspired by: Cai, C. J., Jongejan, J., and Holbrook, J. (2019). The effects of example-based explanations in a machine learning interface. In Proceedings of the 24th International Conference on Intelligent User Interfaces (pp. 258-262).
Example-based Explanations!
Background: Using Examples to Explain Model Predictions
[1] Kim, B., Khanna, R., and Koyejo, O. O. (2016). Examples are not enough, learn to criticize! Criticism for interpretability. Advances in Neural Information Processing Systems.
[2] Verma, S., Dickerson, J., and Hines, K. (2020). Counterfactual explanations for machine learning: A review. arXiv preprint arXiv:2010.10596.
[3] Peterson, L. E. (2009). K-nearest neighbor. Scholarpedia, 4(2), p. 1883.
Examples Are Useful for Model Understanding
[Cai et al., 2019]
Examples can serve as explanations of algorithmic behavior for laypersons.
[Bove et al., 2022]
Contextualization improves non-expert users' understanding of feature importance.
However, there is a lack of systematic investigation to answer:
1. What factors are taken into consideration when using examples for model understanding and debugging?
2. How do examples help with model understanding and debugging in practice?
Understanding Model Behaviors via Context
We define the Context of an instance as:
A set of instances (examples) selected or generated according to certain criteria, in order to understand how the model makes the prediction on this instance.
Our Contribution
Step 1: Literature Review
Answer “What factors are taken into consideration when using examples for model understanding and debugging?”
Context Usage: A Literature Review
We conducted an initial analysis of 20 papers that use context to understand and debug an ML model.
Analysis Result: Interpretable Context
Interpretable Context
  Context Generation
    Similarity: Data-driven Similarity; Model-driven Similarity
    Model Output: Same Prediction; Different Prediction (e.g., counterfactuals); All Predictions (e.g., nearest neighbors)
    Source: Existing Data; Generated Data
  Context Summarization
    Low-level: instance
    Mid-level: distribution
    High-level: auto summary
Analysis Result: Interpretable Context
Similarity
Data-driven Similarity: e.g., similar in terms of RGB values.
Model-driven Similarity: e.g., similar in terms of the features in the last layer of a CNN.
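To make the two notions concrete, here is a minimal sketch in Python; it assumes a callable embed that maps raw inputs to the model's last-layer features, and all function names are illustrative rather than from the paper.

import numpy as np

def data_driven_distance(x, candidates):
    """Distance in the raw input space (e.g., pixel RGB values)."""
    return np.linalg.norm(candidates - x, axis=1)

def model_driven_distance(x, candidates, embed):
    """Distance in the model's learned space, e.g., the features from
    the last layer of a CNN. `embed` maps raw inputs to that space."""
    z = embed(x[np.newaxis, :])   # shape (1, d)
    Z = embed(candidates)         # shape (n, d)
    return np.linalg.norm(Z - z, axis=1)

def k_closest(distances, k=5):
    """Indices of the k most similar candidates; these form the context."""
    return np.argsort(distances)[:k]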
Analysis Result: Interpretable Context
Model Output
Same Prediction
Different Prediction (e.g., counterfactuals)
All Predictions (e.g., nearest neighbors)
Analysis Result: Interpretable Context
Source
Existing Data
Generated Data
Analysis Result: Interpretable Context
Context Summarization
Low-level: e.g., an instance
Mid-level: e.g., a distribution
High-level: e.g., an auto-generated summary, such as:
“People who are under 30 years old and whose BMI is under 35 will be predicted healthy by the diabetes prediction model.”
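A high-level summary like this is effectively a subgroup rule checked against model predictions. Here is a minimal sketch of such a check; the age and BMI thresholds come from the sentence above, while the array layout and label encoding are assumptions.

import numpy as np

def rule_support(model, X, age_col, bmi_col, healthy_label):
    """Fraction of the subgroup (age < 30 and BMI < 35) that the
    model predicts as `healthy_label`; 1.0 means the rule holds."""
    subgroup = X[(X[:, age_col] < 30) & (X[:, bmi_col] < 35)]
    if len(subgroup) == 0:
        return None  # rule covers no one
    return float(np.mean(model.predict(subgroup) == healthy_label))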
For a specific task, how can interpretable context help?
We use Model Debugging as an example.
Step 1: Literature Review
Answer “What factors are taken into consideration when using examples for model understanding and debugging?”
Step 2: Prototyping
Design a prototype to support a specific task based on context.
From Taxonomy to Design Goals of Model Debugging
Context Generation
G1: Customize parameters to find neighbors of an instance in the training data (to check how the model learns from similar cases).
G2: Inspect counterfactuals generated with desired properties (to check what the model assumes about the desired class).
Context Summarization
G3: Enable users to inspect a visualization of the selected instance and its context, to visually capture patterns in the context.
G4: Provide an auto-generated summary to guide users in interpreting the context.
Context Sight: Prototype User Interface
Usage Scenario
Data: Home Equity Line of Credit (HELOC) Dataset
(FICO xML Challenge)
Model: Multi-layer Perceptron, accuracy: 72.67%
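A minimal training sketch for such a model, assuming a local heloc.csv export of the FICO dataset with a binary RiskPerformance label; the architecture and split are illustrative and will not reproduce the 72.67% figure exactly.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heloc.csv")                     # assumed local export
X = df.drop(columns=["RiskPerformance"]).values
y = (df["RiskPerformance"] == "Bad").astype(int)  # 1 = Default

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
mlp = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))
mlp.fit(X_tr, y_tr)
print("test accuracy:", mlp.score(X_te, y_te))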
Context Sight
Select an instance that is predicted as Default but should be Not Default.
Context Generation
Set desired properties to search for nearest neighbors (G1).
Set desired properties to generate counterfactuals (G2).
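Both settings can be sketched as follows; this is not Context Sight's implementation, just a minimal illustration in which the counterfactual search is simplified to picking existing instances of the desired class (real counterfactual generators optimize synthetic instances instead).

import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighbors_with_params(X_train, x, k=10, feature_mask=None, metric="euclidean"):
    """G1: neighbors of x in the training data, with user-set parameters
    (k, distance metric, which features count toward similarity)."""
    cols = slice(None) if feature_mask is None else feature_mask
    nn = NearestNeighbors(n_neighbors=k, metric=metric).fit(X_train[:, cols])
    _, idx = nn.kneighbors(x[cols][np.newaxis, :])
    return idx[0]

def counterfactuals_from_data(model, X_train, x, desired_class, k=10):
    """G2 (simplified): the k instances closest to x among those the
    model already assigns the desired class."""
    pool = X_train[model.predict(X_train) == desired_class]
    d = np.linalg.norm(pool - x, axis=1)
    return pool[np.argsort(d)[:k]]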
Context Visualization (G3): Data Table
Context Visualization (G3): Parallel Coordinates
The buttons control what is shown in the parallel coordinates.
Context Visualization (G3): Feature
Each feature is represented by three axes.
% Trades w/ Balance
Example: A loan applicant has % Trades w/ Balance = 80; the applicant is predicted Will Default but should be Will Not Default.
1st Axis: The feature value in the original instance, and that in the counterfactual.
The model seems to have learned too low a threshold on % Trades w/ Balance for applicants who will not default.
Context Visualization (G3): Feature
% Trades w/ Balance
2nd + 3rd Axes: Context examples with the predicted class, and those with the desired class (the ground truth).
Scatterplot
Histogram
The context of this mispredicted instance appears to contain more Default cases around 80 and fewer Not Default cases around 80.
This may be relevant to why the error happens.
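Context Sight's three-axis view is a custom design, but the underlying parallel-coordinates idea can be sketched with pandas; the feature values and labels below are made up for illustration.

import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# The context: the selected instance plus neighbors and counterfactuals,
# labeled by predicted vs. desired class (values are illustrative).
context = pd.DataFrame({
    "% Trades w/ Balance": [80, 78, 83, 45, 40],
    "Months Since Oldest Trade": [120, 110, 130, 200, 210],
    "label": ["Default", "Default", "Default", "Not Default", "Not Default"],
})
parallel_coordinates(context, class_column="label", colormap="coolwarm")
plt.show()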
Auto Summarization of Context (G4)
The auto summary is generated based on the entropy of the predicted classes in each subgroup.
→ Low entropy indicates where the model makes consistent predictions.
→ This guides users to check those feature ranges to reason about the error (G4).
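A minimal sketch of this entropy-based summarization, assuming numpy arrays of predictions and a single feature; the binning strategy and names are assumptions, not the paper's exact method.

import numpy as np
from scipy.stats import entropy

def summarize_subgroups(preds, feature, bins=5):
    """Bin one feature, then score each bin by the entropy of the
    predicted classes inside it. Low entropy = consistent predictions;
    those ranges are surfaced to the user first (G4)."""
    edges = np.quantile(feature, np.linspace(0, 1, bins + 1))
    scored = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (feature >= lo) & (feature <= hi)
        if not in_bin.any():
            continue
        _, counts = np.unique(preds[in_bin], return_counts=True)
        scored.append(((lo, hi), entropy(counts, base=2)))
    return sorted(scored, key=lambda s: s[1])  # most consistent ranges first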
Conclusion
Future Work
We plan to conduct an observational study based on Context Sight.
Step 1: Literature Review
Answer “What factors are taken into consideration when using examples for model understanding and debugging?”
Step 2: Prototyping
Design a prototype to support a specific task based on context.
Step 3: Observational Study
Observe how practitioners use context to understand and debug a model.
It remains unknown how practitioners actually use context to understand and debug a model in practice.
Future Work
Specifically, we use Context Sight as a probe to understand the following research questions:
RQ1: How do context examples and summaries help reach the goal of model understanding and debugging?
RQ2: What role does interaction play when using context? Do practitioners have different workflows when using context?
Thanks :)