Unveiling the Black Box: A Guide to Explainable and Interpretable ML
Presented by Kristof Juhasz
Formulas, definitions, examples taken from Molnar, C. (2022). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.). christophm.github.io/interpretable-ml-book/
Presentation Outline
What is Interpretable Machine Learning? Why do we care?
Market Research – Research Labs
Market Research: “Startups”
A search for interpretability startups returns more than 50 results globally.
Interpretability Taxonomy
• Global Interpretability
Achieved with inherently interpretable models such as decision trees and linear models.
Focus on understanding the model’s overall behavior across the entire dataset.
• Local Interpretability
Focus on individual predictions (e.g., LIME, SHAP).
Useful when specific decisions need explanation.
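A local explanation in the LIME spirit can be sketched by fitting a proximity-weighted linear surrogate around one instance. This is a minimal sketch of the idea, not the LIME library’s API; the function name, kernel width, and toy black box are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_explanation(predict, x, n_samples=500, scale=0.5, seed=0):
    """LIME-style sketch: explain predict(x) with a proximity-weighted linear fit."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))      # perturb around x
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * scale ** 2))  # proximity kernel
    surrogate = Ridge(alpha=1.0).fit(Z, predict(Z), sample_weight=w)
    return surrogate.coef_  # local feature attributions

# Toy black box: nonlinear in feature 0, linear in feature 1
f = lambda Z: Z[:, 0] ** 2 + 3 * Z[:, 1]
x = np.array([1.0, 0.0])
coefs = local_linear_explanation(f, x)  # close to the local gradient (2, 3)
```

The surrogate coefficients recover the black box’s local slope at `x`, which is exactly what a local explanation asks for.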
Partial Dependence Plots (PDP) - Introduction
The partial dependence plot (short PDP or PD plot) shows the marginal effect one or two features have on the predicted outcome of a machine learning model (Friedman 2001). A partial dependence plot can show whether the relationship between the target and a feature is linear, monotonic or more complex.
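The definition can be sketched in a few lines: clamp the feature of interest to each grid value and average the predictions over the data. A minimal sketch, where `predict` stands in for any fitted model’s prediction function and the toy model is an assumption:

```python
import numpy as np

def partial_dependence_1d(predict, X, feature, grid):
    """PDP sketch: average prediction with one feature forced to each grid value."""
    pd_values = []
    for g in grid:
        Xg = X.copy()
        Xg[:, feature] = g                    # clamp the feature for every row
        pd_values.append(predict(Xg).mean())  # marginalise over the other features
    return np.array(pd_values)

# Toy model: linear in feature 0, so its PDP is a straight line with slope 2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
f = lambda X: 2 * X[:, 0] + X[:, 1] ** 2
grid = np.linspace(-2, 2, 9)
pdp = partial_dependence_1d(f, X, feature=0, grid=grid)
```

Because feature 0 enters the toy model linearly, consecutive PDP values differ by a constant step, which is the linear shape the slide describes.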
Examples: PDPs
Individual Conditional Expectation (ICE)
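ICE curves disaggregate the PDP into one curve per instance, revealing heterogeneous effects that the average can hide. A minimal sketch using scikit-learn’s `kind="individual"` option; the model and data are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=100, n_features=3, random_state=0)
model = RandomForestRegressor(n_estimators=30, random_state=0).fit(X, y)

# kind="individual" returns one curve per instance instead of the average
ice = partial_dependence(model, X, features=[0], kind="individual",
                         grid_resolution=15)
curves = ice["individual"][0]  # shape: (n_samples, n_grid_points)
```

Averaging the ICE curves over the instances gives back the PDP, which is why the two plots are usually shown together.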
Disadvantages of PDPs
Marginal vs Conditional
ALE plots (Accumulated Local Effects)
Example ALE vs PDP
Advantages of ALE
All in all, in most situations ALE plots are preferable to PDPs, because features are usually correlated to some extent.
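A first-order ALE estimate can be sketched directly: split the feature into quantile bins, average the prediction difference across each bin over the instances inside it, then accumulate and centre. A minimal sketch assuming a continuous feature; the function name and toy model are illustrative:

```python
import numpy as np

def ale_1d(predict, X, feature, n_bins=10):
    """First-order ALE sketch for one continuous feature."""
    x = X[:, feature]
    # Quantile bin edges, so every bin holds roughly equal mass
    edges = np.unique(np.quantile(x, np.linspace(0, 1, n_bins + 1)))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, len(edges) - 2)
    effects = np.zeros(len(edges) - 1)
    for b in range(len(edges) - 1):
        members = X[idx == b]
        if len(members) == 0:
            continue
        lo, hi = members.copy(), members.copy()
        lo[:, feature] = edges[b]        # move the bin's members to its lower edge
        hi[:, feature] = edges[b + 1]    # ... and to its upper edge
        effects[b] = (predict(hi) - predict(lo)).mean()  # local effect in the bin
    ale = np.cumsum(effects)             # accumulate the local effects
    return edges, ale - ale.mean()       # centre so the ALE averages to zero

# Toy model: effect of feature 0 is linear, so its ALE curve rises steadily
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
f = lambda X: 3 * X[:, 0] + X[:, 1]
edges, ale = ale_1d(f, X, feature=0)
```

Because only within-bin prediction differences are used, instances are never moved far outside the data distribution, which is the key advantage over PDPs under correlated features.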
Permutation Feature Importance
Example of Feature Importance
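Permutation feature importance measures how much the model’s error grows when one feature’s values are shuffled, breaking its association with the target. A minimal model-agnostic sketch; the helper name and toy data are assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def permutation_importance_sketch(model, X, y, n_repeats=5, seed=0):
    """Importance = mean increase in squared error after shuffling each feature."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((y - model.predict(X)) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        errors = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the feature-target association
            errors.append(np.mean((y - model.predict(Xp)) ** 2))
        importances[j] = np.mean(errors) - baseline
    return importances

# Feature 0 carries almost all the signal, so it should rank first
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 300)
model = LinearRegression().fit(X, y)
imp = permutation_importance_sketch(model, X, y)
```

Nothing model-specific is used: only `predict` and an error metric, which is what makes the method model-agnostic.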
Shapley values
The Shapley value is the only attribution method that satisfies the properties Efficiency, Symmetry, Dummy and Additivity, which together can be considered a definition of a fair payout.
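The fair-payout idea can be checked on a tiny cooperative game by computing Shapley values exactly over all coalitions (feasible only for a handful of players). The value function below is an illustrative additive game, not a model explanation:

```python
from itertools import combinations
from math import factorial

def shapley_values(v, players):
    """Exact Shapley values by enumerating every coalition (small games only)."""
    p = len(players)
    phi = {}
    for j in players:
        others = [q for q in players if q != j]
        total = 0.0
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Coalition weight: |S|! (p - |S| - 1)! / p!
                weight = factorial(r) * factorial(p - r - 1) / factorial(p)
                total += weight * (v(set(S) | {j}) - v(set(S)))
        phi[j] = total
    return phi

# Additive game: each player contributes its own weight in every coalition,
# so each player's Shapley value equals exactly that weight (Dummy + Efficiency)
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
v = lambda S: sum(weights[q] for q in S)
phi = shapley_values(v, list(weights))
```

The payouts also sum to the grand coalition’s value, which is the Efficiency property in action; SHAP approximates this computation for model features, where exact enumeration is intractable.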
Advantages
SHAP examples
SHAP interaction values
Implementation packages
The Future of Interpretability
It is much easier to automate interpretability when it is decoupled from the underlying machine learning model. The advantage of model-agnostic interpretability lies in its modularity. We can easily replace the underlying machine learning model.
An already visible trend is the automation of model training. That includes automated engineering and selection of features, automated hyperparameter optimization, comparison of different models, and ensembling or stacking of the models. The result is the best possible prediction model. When we use model-agnostic interpretation methods, we can automatically apply them to any model that emerges from the automated machine learning process.
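The modularity claim can be illustrated in a few lines: the same model-agnostic interpretation call plugs into whatever model an automated pipeline produces. The models and data below are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=300, n_features=5, random_state=0)

# Swap the underlying model freely; the interpretation step does not change
results = {}
for model in (Ridge(), RandomForestRegressor(n_estimators=50, random_state=0)):
    model.fit(X, y)
    r = permutation_importance(model, X, y, n_repeats=5, random_state=0)
    results[type(model).__name__] = r.importances_mean
```

Because the interpretation method touches only `fit`/`predict`, an AutoML system can attach it unchanged to every candidate model it trains.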