1 of 11

Beyond LLMs: Transparency, Ethics, and Interpretability in Geospatial Artificial Intelligence

Virginia Ziulu

Data Scientist

2 of 11

AI ethics discussions are dominated by LLMs

AI Chatbots Will Never Stop Hallucinating

Scientific American - April 5, 2024

AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?

Live Science - June 21, 2025

Toward a Theory of AI Errors: Making Sense of Hallucinations, Catastrophic Failures, and the Fallacy of Generative AI

Harvard Data Science Review - November 25, 2024

America Isn’t Ready for What AI Will Do to Jobs

The Atlantic - February 10, 2026

Can AI replace junior workers?

The Economist - October 13, 2025

Researchers uncover AI bias against older working women

Stanford Report - October 17, 2025

3 of 11

Yet AI is also widely used to analyze images and spatial data

Cropland, Built-up, Snow, ...

InceptionV3 network architecture. Source: Google.

4 of 11

GeoAI is already being used to generate evidence for evaluation and policy analysis

Source: Tirana Learning Engagement, Independent Evaluation Group (IEG)

AI Method (Semantic Segmentation, Computer Vision)

5 of 11

GeoAI-derived indicators can fail in ways that are difficult to detect

Urban Fabric

(Top: Europe – Bottom: Africa)

Informal Settlement

(Top: North America – Bottom: Africa)

Rural Farmland

(Top: Europe – Bottom: SE Asia)

Images Source: Google Earth Pro.

6 of 11

GeoAI systems can reproduce spatial and demographic biases embedded in their training data

Source: Gevaert, Caroline M., Thomas Buunk, and Marc JC Van Den Homberg. "Auditing geospatial datasets for biases: Using global building datasets for disaster risk management." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 17 (2024): 12579-12590.

7 of 11

The processes underlying GeoAI-derived predictions are often not fully transparent or interpretable

Figure axes: Interpretability, Prediction Accuracy

Models shown: Linear/statistical models, Decision Trees, Random Forests, Deep Learning/GeoAI

8 of 11

Advances in generative AI are making synthetic satellite imagery increasingly plausible

Image panels labeled: Real, Fake, Real, Real, Fake, Fake

Source: Ziulu, Virginia, and James Garforth. "Advancing Deepfake Detection in RGB Satellite Imagery Through Domain-Specific Ensembles." IGARSS 2025 - 2025 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2025.

9 of 11

GeoAI introduces distinct epistemic and transparency risks

Responsible use requires stronger validation frameworks

Accuracy metrics are only the first layer of GeoAI validation, not the endpoint.

Data & model validation

→ Is the underlying data and model performance reliable?

(dataset documentation, provenance, accuracy metrics, F1, IoU)

Robustness/generalization testing

→ Are results stable under distribution shift?

(temporal transfer, spatial transfer, sensitivity tests)

Context/geographic validation

→ Are outputs valid across places and populations?

(bias checks, subgroup analysis, cross-region comparison)

Decision validation

→ Are outputs trustworthy for real-world decisions?

(expert judgment, triangulation, interpretability, domain knowledge)
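
The first and third layers above can be illustrated with a minimal sketch. It computes IoU and F1 for binary segmentation masks and then compares the averages across regions. The function names (iou_and_f1, metrics_by_region), the region labels, and the toy masks are hypothetical and used only for illustration; they do not come from any evaluation cited in this deck.

```python
import numpy as np


def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()    # predicted and present
    fp = np.logical_and(pred, ~truth).sum()   # predicted but absent
    fn = np.logical_and(~pred, truth).sum()   # missed
    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)


def metrics_by_region(samples):
    """samples: iterable of (region, pred_mask, truth_mask).

    Returns {region: (mean IoU, mean F1)} so regions can be compared.
    """
    grouped = {}
    for region, pred, truth in samples:
        grouped.setdefault(region, []).append(iou_and_f1(pred, truth))
    return {r: tuple(np.mean(v, axis=0).tolist()) for r, v in grouped.items()}


# Hypothetical toy data: one 2x2 mask pair per region.
toy = [
    ("region_a", [[1, 1], [0, 0]], [[1, 1], [0, 0]]),
    ("region_b", [[1, 1], [0, 0]], [[1, 0], [1, 0]]),
]
report = metrics_by_region(toy)
iou_gap = report["region_a"][0] - report["region_b"][0]
print(report)
print("IoU gap between regions:", iou_gap)
```

A single pooled score would hide the gap that the per-region report exposes, which is why accuracy metrics are treated here as only the first layer.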

10 of 11

GeoAI introduces distinct epistemic and transparency risks

Responsible use requires stronger validation frameworks

Accuracy metrics are only the first layer of GeoAI validation, not the endpoint.

Data & model validation

→ Does the model correctly segment vegetation in known benchmarks?

(IoU, accuracy, pretrained model performance)

Robustness/generalization testing

→ Does segmentation remain stable under visual variation?

(lighting, seasonality, image quality)

Context/geographic validation

→ Does “greenery” behave consistently in Tirana vs training cities?

(urban morphology, vegetation types, informal greenery)

Decision validation

→ Are greenery indicators meaningful for urban evaluation in Tirana?

(urban planning relevance, local interpretation, expert judgment)
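
The robustness layer above (stability under lighting variation) can be probed with a simple sensitivity test, sketched below. The excess green index threshold is a stand-in for a real segmentation model, not the semantic segmentation method used in the Tirana work. The function names, the threshold, and the brightness factors are all assumptions chosen for illustration, and pixel values are assumed to be scaled to [0, 1].

```python
import numpy as np


def green_fraction(image, threshold=0.1):
    """Share of pixels flagged as vegetation by a stand-in rule.

    Assumption: image is an (H, W, 3) RGB array scaled to [0, 1].
    The rule is the excess green index, ExG = 2G - R - B > threshold.
    """
    rgb = np.asarray(image, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    exg = 2 * g - r - b
    return float((exg > threshold).mean())


def brightness_sensitivity(image, factors=(0.6, 0.8, 1.0, 1.2, 1.4)):
    """Recompute the greenery indicator under simulated lighting changes.

    Returns ({factor: green fraction}, spread), where spread is the
    difference between the largest and smallest indicator value.
    """
    rgb = np.asarray(image, dtype=float)
    results = {
        f: green_fraction(np.clip(rgb * f, 0.0, 1.0)) for f in factors
    }
    spread = max(results.values()) - min(results.values())
    return results, spread


# Hypothetical input: a random image standing in for a street-level scene.
rng = np.random.default_rng(0)
scene = rng.random((32, 32, 3))
per_factor, spread = brightness_sensitivity(scene)
print(per_factor)
print("Indicator spread across lighting conditions:", spread)
```

A large spread would suggest the greenery indicator reflects imaging conditions as much as vegetation. The same pattern extends to seasonality or image quality by swapping the perturbation.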

11 of 11

Thank You.