1 of 18

Explainability Across the AI Lifecycle

Engineering Trust, Reproducibility, and Accountability in Development Measurement

Dr. Mohammed Ba-Aoum

Blue Cross Blue Shield NC | NIH

Presented at MeasureDev 2026 | World Bank

2 of 18

Acceptance of black-box models, if it ever did happen, would be a strange, technocratic coup in which modellers have gained the power to shape the assumptions, perceptions, and conclusions of decision-makers in a way the decision-makers themselves do not quite understand.

Meadows & Robinson, 1985 — written forty years before the current debate, still unresolved.

3 of 18


Roadmap of This Talk

01 Why Explainability Matters

Business · Regulation · Development Contexts

02 Key Thesis & Conceptual Spectrum

The Right Reason · Interpretability → Understanding

03 Three Levels of Explainability

Data · Model · Outcome

04 Techniques & XAI Lifecycle

LIME · SHAP · By-Design vs Post-Hoc

05 System-Level Explainability

From Model to System Accountability

06 LLMs & Open Science

New Frontier · Development Contexts

4 of 18


The Core Problem & Key Thesis

The Problem

AI is embedded in healthcare, development policy, finance, and social systems, but understanding has not kept pace.

WHAT: Prediction

WHY: Explanation

This gap creates a fundamental tension between performance and accountability, felt most acutely in development contexts where decisions affect vulnerable populations.

The Challenge

"Did the model predict correctly, or for the right reason?"

A model can achieve high accuracy and still:

⚖️ Produce biased outcomes against vulnerable groups

🔗 Rely on misleading patterns that break in the field

💥 Fail when context shifts: shortcut models collapse

🚫 Generate decisions that cannot be justified to affected people

In development contexts, where data are sensitive, decisions affect vulnerable populations, and institutional trust is paramount, a correct prediction only partially solves the problem. We need to know the why.

5 of 18

Why Explainability Matters

01 ⚖️ Regulation & Governance

The EU AI Act mandates documentation, testing, and transparency for high-risk systems.

Development datasets (health records, poverty surveys) can fall under HIPAA, GDPR, and national data sovereignty laws, depending on jurisdiction.

Organizations must generate audit trails to demonstrate compliance.

02 🎯 Fairness, Bias & Equity

Models inherit historical biases. Explainability reveals proxy discrimination (location → income) before deployment.

Decisions affect underrepresented populations with limited ability to contest outcomes, so the right to explanation is a matter of equity.

Fragile institutional trust: opaque AI systems actively erode the legitimacy needed for effective policy intervention.

03 📊 Trust, Adoption & ROI

Transparent systems drive acceptance among field staff, beneficiaries, and partner governments

Explainability enables faster debugging → reduces the cost of model failure in production environments with high data costs.

ROI framing is more relevant for private sector AI — here, trust and mission alignment are the primary returns.

6 of 18

The Full Spectrum: Three Distinct Concepts

INTERPRETABILITY

By Design (Glass-Box)

Concerns the internal mechanics of the model, answering HOW it arrives at results.

A model is interpretable if a human can consistently predict its output from its structure.

"interpretability is the degree to which a human can understand the cause of a decision" (Miller, 2019)

Example: Linear regression — each coefficient directly quantifies feature influence on the prediction.

Contrast: Deep neural networks are not inherently interpretable ("black-box") because of complex, non-linear transformations across many layers and millions of parameters.

EXPLAINABILITY

Post-Hoc Justification

Applies techniques to black-box models AFTER training to gain insight into their inner workings.

Global methods (Feature Importance): Which features matter across the entire dataset?

Local methods (LIME, SHAP): Why did the model make this specific prediction for this instance?

Critical for justifying individual high-stakes decisions to affected individuals or regulators.

UNDERSTANDING

Holistic AI Governance

The holistic appreciation of a model's capabilities, limitations, and societal impact.

Synthesizes interpretability + explainability with domain expertise, ethics, and business context.

- Does the model logic make sense?

- Is it fair?

- Does it meet strategic goals?

Without this, an organization runs a complex system whose true value and harms remain opaque.

(Biran & Cotton, 2017; Miller, 2019)

7 of 18

Three Goals of Interpretability

"Interpretability is not the destination — it is an instrument for better modeling, better accountability, and better learning."

01 IMPROVE

Debug, Validate & Enhance the Model

Reveal shortcuts, data leakage, incorrect feature effects.

Example: LIME explanations showed that snow in the background (not animal features) drove a wolf/husky classifier. (Ribeiro et al., 2016)

Workflow: Train → Inspect → Identify → Improve features → Retrain.

Interpretability is quality assurance, not just explanation.

02 JUSTIFY

Explain to Stakeholders, Regulators & Affected Individuals

Different stakeholders need different justifications: Creators, Operators, Executors, Decision Subjects, Auditors. (Tomsett et al., 2018)

Decision subjects need recourse: what would change the outcome?

Regulators need auditable trails demonstrating appropriate behavior.

In medical devices: interpretability is part of compliance and approval.

03 DISCOVER

Extract Insights from Models & Data

Models are not only prediction machines — they encode learned relationships.

Example: Churn model reveals drivers (price sensitivity, service quality) enabling targeted intervention, not just risk identification.

Interpretability turns prediction into understanding.

(Molnar, 2025)

8 of 18

Three Levels of Explainability

Explainability operates at distinct but interconnected levels — each requiring different methods and serving different goals.

01 DATA LEVEL

Understanding the data before any model is trained

Key Questions:

  • Are there biases or demographic imbalances?
  • Are proxy variables present (e.g., zip code as proxy for race)?
  • Is there data leakage that will inflate performance?
  • Which features are meaningfully related to the outcome?

Methods: Feature distributions, correlation analysis, PDP/PFI on simple surrogates, bias audits

💡 Poor data → misleading models → misleading explanations. Explainability must start here.
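
The proxy-variable question above can be probed before any model exists. The sketch below is one simple way to do it, not a standard audit procedure: it asks how well a candidate proxy alone would let someone guess a protected attribute, compared with always guessing the majority group. The records and the field names `region` and `group` are fabricated for illustration.

```python
from collections import Counter, defaultdict

def proxy_strength(records, proxy_key, protected_key):
    """Share of records whose protected attribute equals the majority
    value inside their proxy bucket (1.0 means the proxy fully reveals it)."""
    buckets = defaultdict(Counter)
    for record in records:
        buckets[record[proxy_key]][record[protected_key]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in buckets.values())
    return correct / len(records)

def baseline(records, protected_key):
    """Accuracy of always guessing the overall majority group."""
    counts = Counter(record[protected_key] for record in records)
    return counts.most_common(1)[0][1] / len(records)

# Fabricated illustrative records, not real survey data.
records = (
    [{"region": "north", "group": "A"}] * 45
    + [{"region": "north", "group": "B"}] * 5
    + [{"region": "south", "group": "A"}] * 10
    + [{"region": "south", "group": "B"}] * 40
)

strength = proxy_strength(records, "region", "group")
base = baseline(records, "group")
print(f"proxy accuracy={strength:.2f}, baseline={base:.2f}")
```

A large gap between the two numbers suggests the feature can stand in for the protected attribute, so dropping the protected column alone would not remove the risk.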

02 MODEL LEVEL

Understanding how the model processes inputs & makes predictions

Key Questions:

  • Which features drive overall model behavior?
  • Is the model relying on shortcuts or spurious correlations?
  • Does the model logic align with domain knowledge?
  • Is behavior consistent across demographic subgroups?

Methods: Global: Feature Importance, PDP, ALE, Surrogate Models, SHAP (global). Model-specific: attention maps.

💡 This is where dangerous shortcuts are caught before deployment.

03 OUTCOME LEVEL

Understanding why a specific individual decision was made

Key Questions:

  • Why was this household excluded from the benefit program?
  • What would need to change to get a different outcome?
  • Can this decision be contested and challenged?

Methods: Local: LIME, SHAP (instance), Counterfactuals, Anchors, ICE

💡 Critical for affected populations, regulators, and auditors. Enables contestability and recourse.

9 of 18

XAI Techniques: A Mental Map

BY DESIGN vs POST-HOC

By Design (Intrinsic)

Restrict model class to ensure transparency. Linear regression, decision trees, logistic regression. Coefficients are the explanation.
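
To make "coefficients are the explanation" concrete, here is a minimal sketch of how a linear model's prediction splits into one additive contribution per feature. The intercept, coefficients, and feature names are hypothetical values standing in for an already-fitted model.

```python
# Hypothetical coefficients of an already-fitted linear model (assumption).
INTERCEPT = 2.0
COEFFICIENTS = {
    "household_size": -0.5,
    "years_schooling": 0.8,
    "distance_to_market_km": -0.1,
}

def explain_linear(features):
    """Return the prediction and each feature's additive contribution."""
    contributions = {
        name: COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    }
    prediction = INTERCEPT + sum(contributions.values())
    return prediction, contributions

household = {"household_size": 4, "years_schooling": 6, "distance_to_market_km": 10}
prediction, contributions = explain_linear(household)

for name, value in contributions.items():
    print(f"{name:>24}: {value:+.2f}")
print(f"{'prediction':>24}: {prediction:.2f}")
```

No separate explanation method is needed here: the contributions are read directly off the model, which is what restricting the model class buys.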

Post-Hoc

Train any model, then add interpretation afterward. Works on black-boxes. Can be model-agnostic or model-specific.

MODEL-AGNOSTIC vs MODEL-SPECIFIC

Model-Agnostic

Treat model as black box. SIPA principle: Sample → Intervene → Predict → Aggregate. Works for any algorithm. (Scholbeck et al., 2020)
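
Permutation feature importance is one instance of the SIPA pattern, and a small version fits in a few lines. The sketch below assumes a stand-in `model` function and synthetic data; it only ever calls the model, never inspects it.

```python
import random

def model(row):
    """Stand-in black box (assumption): strong on x0, weak on x1, ignores x2."""
    return 3.0 * row[0] + 0.5 * row[1]

def mse(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, n_repeats=20, seed=0):
    """Average increase in error when one feature column is shuffled."""
    rng = random.Random(seed)
    base_error = mse(rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]              # Sample: here, all rows
            rng.shuffle(column)                        # Intervene: break feature j
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(permuted, targets) - base_error)  # Predict
        importances.append(sum(increases) / n_repeats)             # Aggregate
    return importances

# Synthetic data whose targets come from the stand-in model itself.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]

importances = permutation_importance(rows, targets)
print([round(v, 3) for v in importances])
```

Because only predictions are used, the same function would work unchanged if `model` were swapped for any other algorithm.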

Model-Specific

Use internal structure (weights, gradients, attention). Powerful but not transferable across model types. Best for neural network internals.

LOCAL vs GLOBAL

Local Methods

Explain one specific prediction. LIME, SHAP (instance), Counterfactuals, Anchors, ICE. Essential for "right to explanation" and auditing individual decisions.

Global Methods

Explain overall model behavior across dataset. PDP, ALE, Permutation Feature Importance, Surrogate Models. Reveal dominant patterns and potential biases.

Scholbeck et al. (2020)

10 of 18

Key Techniques:

LIME — Local Interpretable Model-Agnostic Explanations

HOW:

Generates synthetic data around a specific instance → observes how predictions change → fits a simple interpretable model to those local perturbations.

USE:

Why was this loan rejected? Justifying individual decisions to regulators.
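
The three LIME steps above can be sketched without any library. This is a simplification of the idea, not the `lime` package's implementation: it perturbs an instance with Gaussian noise, weights samples by proximity, and fits a weighted linear model. The `black_box` function, noise scale, and kernel width are all assumptions chosen for illustration.

```python
import math
import random

def black_box(x):
    """Stand-in non-linear model (assumption for illustration)."""
    return 1.0 / (1.0 + math.exp(-(x[0] ** 2 - 2.0 * x[1])))

def solve(a, b):
    """Solve the linear system a @ w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def lime_sketch(instance, n_samples=500, scale=0.3, kernel_width=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]     # perturb
        y = black_box(z)                                      # query the model
        dist2 = sum((a - b) ** 2 for a, b in zip(z, instance))
        w = math.exp(-dist2 / kernel_width ** 2)              # proximity weight
        feats = [1.0] + [a - b for a, b in zip(z, instance)]  # centered features
        for i in range(d + 1):
            xtwy[i] += w * feats[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * feats[i] * feats[j]
    weights = solve(xtwx, xtwy)
    return weights[0], weights[1:]  # local intercept, local slopes

intercept, slopes = lime_sketch([1.0, 0.5])
print(round(intercept, 3), [round(s, 3) for s in slopes])
```

The slopes are only valid near the chosen instance; a different instance generally yields a different local model, which is the point of a local explanation.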

SHAP — SHapley Additive exPlanations

HOW:

Assigns each feature a contribution score based on game-theoretic Shapley values. Works globally (feature importance) and locally (instance-level attribution).

USE:

Credit model auditing: confirming that the model's logic aligns with lending best practices and fairness guidelines.
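
For a handful of features, Shapley values can be computed exactly by enumerating coalitions, which helps show what the attribution means. The sketch below is not the `shap` library: it "removes" a feature by replacing it with a baseline value (one common convention, assumed here), and the model, feature names, and values are hypothetical.

```python
from itertools import combinations
from math import factorial

def model(x):
    """Stand-in scoring model with an interaction term (assumption)."""
    return 2.0 * x["income"] + 1.0 * x["tenure"] + 0.5 * x["income"] * x["tenure"]

def shapley_values(instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline)."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Coalition members take the instance's value; others stay at baseline.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi

instance = {"income": 3.0, "tenure": 2.0, "region_code": 1.0}
baseline = {"income": 1.0, "tenure": 1.0, "region_code": 0.0}
phi = shapley_values(instance, baseline)
print({k: round(v, 3) for k, v in phi.items()})
```

The attributions sum to the gap between the instance's prediction and the baseline prediction, and a feature the model never uses (`region_code` here) receives zero. Exact enumeration grows exponentially with the number of features, which is why practical tools approximate.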

Counterfactuals — Minimal-Change Explanations

HOW:

Determines the smallest change to input features that would flip the model's prediction — directly answering "What would need to be different?"

USE:

Recourse: rejected loan applicants understand what actions could change the outcome.
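
A counterfactual search can be as simple as trying a grid of feasible changes and keeping the cheapest one that flips the decision. The sketch below assumes a toy decision rule, a hypothetical set of actionable changes, and a made-up cost per unit of change; real recourse tools add constraints such as plausibility and immutability of certain features.

```python
from itertools import product

def approve(applicant):
    """Stand-in decision rule (assumption): approve when the score reaches 0."""
    score = 4 * applicant["income_k"] - 50 * applicant["open_debts"] - 100
    return score >= 0

# Hypothetical actionable changes per feature, and a cost per unit of change.
ACTIONS = {
    "income_k": [0, 5, 10, 15, 20],
    "open_debts": [0, -1, -2],
}
COST = {"income_k": 0.2, "open_debts": 1.0}

def nearest_counterfactual(applicant):
    """Return (cost, changed_applicant) for the cheapest approved variant."""
    best = None
    for deltas in product(*ACTIONS.values()):
        candidate = dict(applicant)
        cost = 0.0
        for name, delta in zip(ACTIONS, deltas):
            candidate[name] += delta
            cost += abs(delta) * COST[name]
        if candidate["open_debts"] < 0:
            continue  # infeasible: cannot hold a negative number of debts
        if approve(candidate) and (best is None or cost < best[0]):
            best = (cost, candidate)
    return best

applicant = {"income_k": 40, "open_debts": 2}
best = nearest_counterfactual(applicant)
print(approve(applicant), best)
```

The answer depends heavily on the cost function: which changes count as "small" is a policy judgment, not something the model can decide.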

PDP / ALE — Partial Dependence & Accumulated Local Effects

HOW:

Show the marginal effect of one or two features on the predicted outcome across the entire dataset. PDP can mislead when features are correlated; ALE is designed to remain reliable in that case.

USE:

Development analytics: understanding the overall relationship between a development indicator and a predicted outcome across populations.
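
A partial dependence curve is just an average of predictions with one feature forced to each grid value. The sketch below covers PDP only (ALE is not implemented) and uses a stand-in model and a four-row toy dataset as assumptions.

```python
def model(row):
    """Stand-in model (assumption): non-linear in feature 0, linear in feature 1."""
    return row[0] ** 2 + 3.0 * row[1]

def partial_dependence(rows, feature_index, grid):
    """Average prediction when the chosen feature is set to each grid value."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # force the feature for every row
            predictions.append(model(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve

rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 2.0]]
grid = [0.0, 1.0, 2.0]
pdp = partial_dependence(rows, 0, grid)
print(pdp)
```

Forcing a value onto every row can create feature combinations that never occur in practice when features are correlated, which is the weakness ALE is meant to address.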

Molnar (2025)

11 of 18

Key Tensions & Design Principles

Accuracy vs Interpretability

Simpler models tend to increase transparency but can reduce predictive performance.

In high-stakes development contexts, interpretability may matter MORE than marginal accuracy gains. Challenge this trade-off deliberately.

Fidelity vs Simplicity

Short explanations necessarily omit causes — useful but incomplete. Good explanations must be simple enough for humans while faithful enough to the model.

Portability vs Detail

Model-agnostic methods work for any algorithm (high portability) but sacrifice depth. Model-specific methods are more precise but become obsolete when you switch architectures.

Global vs Local

Global methods reveal systemic behavior (bias, dominant patterns).

Local methods explain individual decisions (auditability, recourse). Both are essential — neither is sufficient alone.

Transparency vs Gaming Risk

Revealing model logic can enable strategic manipulation (credit scoring). This argues for causal features over proxy features, and thoughtful disclosure policies.

All Explanations Are Incomplete

There is never one true explanation — only selected explanations.

We choose which story to tell. Explainability can mislead if presented as complete truth.

(Molnar, 2025)

12 of 18

“The assumptions and reasoning behind a decision are not examinable, even to the decider. The logic, if there is any, leading to a social policy is unclear to most people affected by the policy. As far as the general public and even many policymakers themselves are concerned, today's vital decisions are about as understandable and accessible as if they had been handed down by a Delphic oracle.”

Meadows & Robinson, 1985

On the opacity of policy decisions based on mental models

13 of 18

The XAI Lifecycle Framework

Explainability is not a one-time feature — it is a continuous governance process supported by MLOps.

INCEPTION & DATA

Bias Detection: Analyze feature distributions; identify demographic imbalances before model training.

Feature Selection: Use PDP/PFI on surrogate models to identify relevant variables.

Data Leakage Check: Interpretability reveals features that should not be available at inference.

MODEL BUILDING

Debugging: LIME/SHAP on individual training instances to expose incorrect learning patterns.

Fairness Auditing: Check model behavior across cohorts before deployment.

Validation: Confirm model logic aligns with domain expertise.
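
A cohort-level fairness check during model building can start from something as basic as comparing selection rates. The sketch below uses fabricated predictions, hypothetical cohort labels, and a review threshold that is purely an illustrative policy choice; it is one of many possible fairness metrics, not a complete audit.

```python
from collections import defaultdict

REVIEW_THRESHOLD = 0.8  # hypothetical policy choice, not a universal standard

def selection_rates(predictions, groups):
    """Share of positive predictions within each cohort."""
    totals, positives = defaultdict(int), defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(prediction)
    return {group: positives[group] / totals[group] for group in totals}

def disparity_ratio(rates):
    """Lowest cohort rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest

# Fabricated predictions for two hypothetical cohorts.
predictions = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = ["urban"] * 5 + ["rural"] * 5

rates = selection_rates(predictions, groups)
ratio = disparity_ratio(rates)
needs_review = ratio < REVIEW_THRESHOLD
print(rates, round(ratio, 2), needs_review)
```

A low ratio does not prove unfairness on its own, but it tells the team where to point the model-level and outcome-level methods next.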

DEPLOYMENT & INFERENCE

Real-Time Explanation: Local explanations for every live prediction — essential for user trust and regulatory "right to explanation."

User Trust & Adoption: Clear explanations of model reasoning drive acceptance in high-stakes contexts.

MONITORING & GOVERNANCE

Drift Detection: If feature importance or feature-prediction relationships change, XAI tools flag model drift requiring retraining.

Continuous Auditing: MLOps automates XAI so governance is ongoing, not episodic.
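
One way a monitoring job might flag the drift described above is to compare the feature-importance profile from a reference period with the current one. The sketch below is an assumption about how such a check could look: the importance values, feature names, and alert threshold are all made up, and the distance used (total variation between normalized profiles) is just one reasonable choice.

```python
ALERT_THRESHOLD = 0.15  # hypothetical; would be tuned per system

def normalize(importances):
    """Scale absolute importances so they sum to 1 (assumes a non-zero total)."""
    total = sum(abs(v) for v in importances.values())
    return {name: abs(v) / total for name, v in importances.items()}

def importance_shift(reference, current):
    """Total variation distance between two normalized importance profiles."""
    ref, cur = normalize(reference), normalize(current)
    names = set(ref) | set(cur)
    return 0.5 * sum(abs(ref.get(n, 0.0) - cur.get(n, 0.0)) for n in names)

# Made-up importance profiles from two monitoring windows.
reference = {"rainfall": 0.5, "asset_index": 0.3, "distance": 0.2}
current = {"rainfall": 0.2, "asset_index": 0.3, "distance": 0.5}

shift = importance_shift(reference, current)
drift_flagged = shift > ALERT_THRESHOLD
print(round(shift, 3), drift_flagged)
```

A flag like this says the model is leaning on different features than before; whether that warrants retraining is still a human decision.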

MLOps Integration: Automates XAI across the full lifecycle — ensuring explainability is a continuous, governed process, not a one-off effort.

Lifecycle stages: 1 Inception → 2 Model Build → 3 Deployment → 4 Monitoring

14 of 18

Explainability as a Partial Substitute for Transparency

In contexts where data cannot be fully shared (privacy, sovereignty, ethics), explainability functions as a PARTIAL SUBSTITUTE for full transparency — enabling structured auditing, reproducibility checks, and verification of model logic.

🔍 Structured Auditing Without Data Sharing

XAI provides auditable trails — feature attributions, decision paths, bias checks — that allow external reviewers to validate model behavior without accessing raw sensitive data. Critical for development institutions handling population-level surveys or health records.

♻️ Reproducibility via Explanation Documentation

When data cannot be re-shared, documenting the model's learned logic (global feature effects, fairness diagnostics) enables reproducibility checks that would otherwise require the original dataset. Explanations become the reproducible artifact.
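
If explanations are to serve as the reproducible artifact, they need a stable, checkable form. The sketch below shows one possible approach, assumed for illustration: store global importances and fairness diagnostics as canonical JSON with a SHA-256 digest, so a reviewer can confirm the record was not altered. The record fields and the model identifier are hypothetical.

```python
import hashlib
import json

def _digest(payload):
    """SHA-256 of a canonical (sorted-key, compact) JSON encoding."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def explanation_record(model_id, global_importance, fairness_metrics):
    """Bundle explanation outputs with a digest for later verification."""
    payload = {
        "model_id": model_id,
        "global_importance": global_importance,
        "fairness_metrics": fairness_metrics,
    }
    return {"payload": payload, "sha256": _digest(payload)}

def verify(record):
    """True when the payload still matches its stored digest."""
    return _digest(record["payload"]) == record["sha256"]

# Hypothetical values for illustration only.
record = explanation_record(
    model_id="poverty-targeting-v1",
    global_importance={"asset_index": 0.42, "household_size": 0.31},
    fairness_metrics={"selection_rate_ratio": 0.91},
)
intact = verify(record)

record["payload"]["global_importance"]["asset_index"] = 0.10  # simulate tampering
tampered_detected = not verify(record)
print(intact, tampered_detected)
```

The record contains no raw data, only summaries of model behavior, which is what lets it travel to external reviewers when the dataset cannot.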

🏛️ Institutional Trust in Governance Contexts

Development institutions (World Bank, UN agencies, national statistics offices) require not just accurate predictions but justifiable ones. Explainability connects AI outputs to domain knowledge, policy objectives, and fairness norms — building institutional confidence in AI-powered measurement.

⚖️ Aligning with the Rashomon Insight

Multiple models may achieve similar performance. Explainability helps institutions choose WHICH model to trust — not just which performs best — by evaluating whether the model's learned logic is consistent with theory and values.

15 of 18

From Model Explainability to System Explainability

The critical shift for development institutions — and the culmination of this framework.

MODEL EXPLAINABILITY

Why did this model make this prediction?

SYSTEM EXPLAINABILITY

Is the entire AI system — from data to decision — understandable, trustworthy, and accountable?

System Explainability encompasses:

📊 Data: Bias audits, feature validity, leakage checks; explainability before training begins

🤖 Models: Interpretable or explainable predictions; local and global methods applied continuously

⚙️ Pipelines: MLOps integration; automated XAI across preprocessing, training, deployment, and monitoring

🏛️ Institutions: Governance structures (who reviews explanations, who can contest decisions, what documentation exists)

🎯 Decisions: Explanations tailored to stakeholders: technical for auditors, accessible for affected populations

Together these cover the full pipeline, from data to decision.

16 of 18

LLMs & The New Frontier of Interpretability

From Static Explanations to Interactive Understanding

OPPORTUNITIES

Natural Language Explanations

LLMs explain complex model behaviors in formats humans naturally understand, reducing the usability barrier of technical tools like saliency maps. Multi-level: from simple to technical.

Interactive & Conversational

Users can ask "Why did you choose this answer?" or "What would happen if this input changed?", engaging dynamically rather than passively. Strongly preferred by decision-makers over static outputs.

Dataset-Level Understanding

LLMs extend interpretability beyond models to explaining entire datasets: identifying patterns, subgroups, and latent structure through natural language narrative.

Cross-Modality Bridge

LLMs interpret complex domains (genomics, chemistry, images) in human-readable form, enabling interpretability where traditional tools fail.

CHALLENGES

Hallucination Risk

LLMs produce explanations that are fluent and convincing but factually incorrect or not grounded in the model's actual reasoning. Most critical challenge — directly undermines trust.

Opacity at Scale

Hundreds of billions of parameters — impossible to directly inspect. Many LLMs are API-based, blocking access to weights or gradients needed for mechanistic analysis.

Faithfulness Gap

Natural-language flexibility increases the risk of post-hoc rationalization: explanations that appear coherent but do not logically entail the prediction. (Atanasova et al.)

Evaluation Difficulty

Simply asking if users "like" an explanation is insufficient. Empirical evidence shows explanations can vary from highly beneficial to completely unhelpful depending on context.

(Singh et al., 2024)

17 of 18

Towards Responsible AI in Development Measurement

01 Explainability is a system-level requirement, spanning data, model, deployment, and monitoring, not a post-hoc add-on.

02 Three goals guide every decision: Improve the model · Justify to stakeholders · Discover insights.

03 In data-constrained environments, explainability is the open-science bridge, enabling auditing, reproducibility, and trust without data sharing.

04 LLMs open a new frontier: interactive, conversational interpretability, but hallucination and faithfulness gaps demand rigorous validation.

05 Interpretability is not about choosing one method; it is about choosing the right combination for your goal and your audience.

Explainability is about making AI decisions understandable, trustworthy, and usable in real-world systems.

18 of 18

A CLOSING REFLECTION

A visitor to the home of Niels Bohr noticed a horseshoe hanging over the door and asked:

“Surely you, a scientist, do not believe in superstitions?”

Bohr replied: “Of course not. But I have been told it works whether you believe in it or not.”

One approach to AI: use it because it works, without needing to understand why.

Bohr could afford that.

In development contexts affecting vulnerable populations, we need to understand how and why it works.