Beyond LLMs: Transparency, Ethics, and Interpretability in Geospatial Artificial Intelligence
Virginia Ziulu
Data Scientist
AI ethics discussions are dominated by LLMs
AI Chatbots Will Never Stop Hallucinating
Scientific American – April 5, 2024
AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?
Live Science – June 21, 2025
Toward a Theory of AI Errors: Making Sense of Hallucinations, Catastrophic Failures, and the Fallacy of Generative AI
Harvard Data Science Review – November 25, 2024
America Isn’t Ready for What AI Will Do to Jobs
The Atlantic – February 10, 2026
Can AI replace junior workers?
The Economist – October 13, 2025
Researchers uncover AI bias against older working women
Stanford Report – October 17, 2025
Yet AI is also widely used to analyze images and spatial data
Example output classes: Cropland, Built-up, Snow, ...
InceptionV3 network architecture. Source: Google.
GeoAI is already being used to generate evidence for evaluation and policy analysis
Source: Tirana Learning Engagement, Independent Evaluation Group (IEG)
AI Method (Semantic Segmentation, Computer Vision)
GeoAI-derived indicators can fail in ways that are difficult to detect
Urban Fabric
(Top: Europe – Bottom: Africa)
Informal Settlement
(Top: North America – Bottom: Africa)
Rural Farmland
(Top: Europe – Bottom: SE Asia)
Images Source: Google Earth Pro.
GeoAI systems can reproduce spatial and demographic biases embedded in their training data
Source: Gevaert, Caroline M., Thomas Buunk, and Marc JC Van Den Homberg. "Auditing geospatial datasets for biases: Using global building datasets for disaster risk management." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 17 (2024): 12579-12590.
The processes underlying GeoAI-derived predictions are often not fully transparent or interpretable
Figure: interpretability versus prediction accuracy across linear/statistical models, decision trees, random forests, and deep learning/GeoAI.
Advances in generative AI are making synthetic satellite imagery increasingly plausible
Figure: satellite image panels labeled Real, Fake, Real, Real, Fake, Fake.
Source: Ziulu, Virginia, and James Garforth. "Advancing Deepfake Detection in RGB Satellite Imagery Through Domain-Specific Ensembles." IGARSS 2025 - 2025 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2025.
GeoAI introduces distinct epistemic and transparency risks
Responsible use requires stronger validation frameworks
Accuracy metrics are only the first layer of GeoAI validation, not the endpoint.
Data & model validation
→ Is the underlying data and model performance reliable?
(dataset documentation, provenance, accuracy metrics, F1, IoU)
Robustness/generalization testing
→ Are results stable under distribution shift?
(temporal transfer, spatial transfer, sensitivity tests)
Context/geographic validation
→ Are outputs valid across places and populations?
(bias checks, subgroup analysis, cross-region comparison)
Decision validation
→ Are outputs trustworthy for real-world decisions?
(expert judgment, triangulation, interpretability, domain knowledge)
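As an illustration of the first and third layers, the sketch below computes IoU and F1 for binary segmentation masks and then repeats the calculation per region, so that a strong overall score cannot hide weak performance in one place. It is a minimal sketch under stated assumptions: the function names `iou_and_f1` and `scores_by_region`, the region labels, and the convention that two empty masks score 1.0 are all illustrative choices, not part of any specific GeoAI pipeline.

```python
import numpy as np


def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()   # predicted and present
    fp = np.logical_and(pred, ~truth).sum()  # predicted but absent
    fn = np.logical_and(~pred, truth).sum()  # present but missed
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)


def scores_by_region(samples):
    """Pool pixels per region and score each region separately.

    samples: iterable of (region_label, pred_mask, truth_mask).
    """
    pooled = {}
    for region, pred, truth in samples:
        preds, truths = pooled.setdefault(region, ([], []))
        preds.append(np.ravel(pred))
        truths.append(np.ravel(truth))
    return {
        region: iou_and_f1(np.concatenate(preds), np.concatenate(truths))
        for region, (preds, truths) in pooled.items()
    }


if __name__ == "__main__":
    # Toy masks with hypothetical region labels, for illustration only.
    samples = [
        ("region_a", [[1, 1], [0, 0]], [[1, 1], [0, 0]]),
        ("region_b", [[1, 1], [0, 0]], [[1, 0], [1, 0]]),
    ]
    for region, (iou, f1) in scores_by_region(samples).items():
        print(f"{region}: IoU={iou:.2f} F1={f1:.2f}")
```

A large gap between regions in such a table would be a prompt for the later layers (expert judgment, triangulation), not a verdict on its own.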
GeoAI introduces distinct epistemic and transparency risks
Responsible use requires stronger validation frameworks
Accuracy metrics are only the first layer of GeoAI validation, not the endpoint.
Data & model validation
→ Does the model correctly segment vegetation in known benchmarks?
(IoU, accuracy, pretrained model performance)
Robustness/generalization testing
→ Does segmentation remain stable under visual variation?
(lighting, seasonality, image quality)
Context/geographic validation
→ Does “greenery” behave consistently in Tirana vs. training cities?
(urban morphology, vegetation types, informal greenery)
Decision validation
→ Are greenery indicators meaningful for urban evaluation in Tirana?
(urban planning relevance, local interpretation, expert judgment)
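The robustness question for greenery can be probed with a simple sensitivity test: rescale image brightness and check whether the share of pixels labeled as greenery stays stable. The sketch below is illustrative only. `toy_green_mask` is a hypothetical stand-in for a real segmentation model (a simple excess-green threshold with an arbitrarily chosen cutoff), and the brightness factors and pixel values are assumptions made for the example.

```python
import numpy as np


def toy_green_mask(rgb):
    """Hypothetical stand-in for a segmentation model.

    Labels a pixel as greenery when 2*G - R - B exceeds an arbitrary cutoff.
    """
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (2 * g - r - b) > 40  # cutoff chosen only for illustration


def green_share(mask):
    """Fraction of pixels labeled as greenery."""
    return float(np.mean(mask))


def share_under_brightness(rgb, factors, segment=toy_green_mask):
    """Greenery share after scaling brightness by each factor."""
    rgb = np.asarray(rgb, dtype=float)
    shares = {}
    for factor in factors:
        # Scale pixel values and clip back to the 8-bit range.
        shifted = np.clip(rgb * factor, 0, 255)
        shares[factor] = green_share(segment(shifted))
    return shares


if __name__ == "__main__":
    # A 2x2 toy image: two greenish pixels, one gray, one red.
    image = [
        [[60, 120, 60], [100, 100, 100]],
        [[50, 80, 50], [200, 50, 50]],
    ]
    for factor, share in share_under_brightness(image, [0.5, 1.0, 1.5]).items():
        print(f"brightness x{factor}: greenery share = {share:.2f}")
```

In this toy case the darker image loses one of the two green pixels, so the indicator changes with lighting alone. The same loop can wrap a real model by passing it as `segment`.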
Thank You.