1 of 11

Advanced ML methods for Natural Hazard Monitoring

Nikolaos Ioannis Bountos

Orion Lab

National Observatory of Athens & National Technical University of Athens

bountos@noa.gr

2 of 11

Exploiting large-scale Remote Sensing data for natural hazard monitoring and forecasting

Vast amount of Remote Sensing data

Great opportunity for hazard monitoring, forecasting and response

e.g.:

Rapid flood extent mapping

Wildfire detection and forecasting

Volcanic activity early warning

Drought monitoring and forecasting

Great advances in pattern recognition (Machine and Deep Learning)

Estimated volume of freely available Satellite data by year. Image source [1]

[1] Soille, Pierre, et al. "A versatile data-intensive computing platform for information retrieval from big geospatial data." Future Generation Computer Systems 81 (2018): 30-40.

IGARSS 2024 | Advanced ML methods for Natural Hazard Monitoring

3 of 11

Challenges of modeling natural hazards

Natural hazards are by definition extreme events → Rare
Difficult to acquire a dedicated dataset for each problem

The annotation process typically requires expert knowledge → Expensive

Modeling challenges combined with label scarcity:

Civil authorities may require detailed information on hazardous events

Intensity of an event

How large was the uplift during a given volcanic unrest event?

Localized information

Where is the damage caused by a flood or hurricane event most severe?

Nature of an event

What was the underlying mechanism for this volcanic unrest episode?

Inferring detailed information from Remote Sensing data often comes down to fine-grained image classification

Spatiotemporal generalization becomes much harder with limited data.

E.g., it is extremely difficult to predict burned areas in Africa when training solely on data from the Mediterranean
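One way to measure this kind of spatial generalization gap is to hold out entire regions at evaluation time. The snippet below is a minimal sketch, not taken from the slides: the `region` field, the region names and the sample records are hypothetical placeholders.

```python
def split_by_region(samples, held_out_regions):
    """Split samples so that held-out regions never appear in the training set."""
    train, test = [], []
    for sample in samples:
        if sample["region"] in held_out_regions:
            test.append(sample)
        else:
            train.append(sample)
    return train, test


# Hypothetical toy records; a real dataset would carry imagery and labels.
samples = [
    {"id": 0, "region": "mediterranean"},
    {"id": 1, "region": "mediterranean"},
    {"id": 2, "region": "africa"},
    {"id": 3, "region": "africa"},
    {"id": 4, "region": "mediterranean"},
]

train, test = split_by_region(samples, held_out_regions={"africa"})
train_regions = {s["region"] for s in train}
test_regions = {s["region"] for s in test}
```

Evaluating only on `test` then reports performance on regions the model has never seen, which a random split would hide.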

4 of 11

Long-tailed distribution example - Volcanic Activity

Instances of volcanic activity and type of ground deformation from 2014 to 2021, as recorded in the Hephaestus[2] dataset

[2] Bountos, Nikolaos Ioannis, et al. "Hephaestus: A large scale multitask dataset towards InSAR understanding." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

5 of 11

Example - Hephaestus Dataset

6 of 11

Fine-grained image classification in the context of natural hazards

Classes have very subtle differences
Intensity of the event may highlight or completely hide them.
Example:

Ground deformation types caused by volcanic activity

Very difficult for an untrained eye to distinguish.
The patterns are barely visible at lower intensities, e.g., Dyke
Challenging in low-data regimes

7 of 11

How can we work around these challenges with the help of the abundance of unlabeled RS data?

8 of 11

Large scale Self-Supervised Learning

Goal:

Exploit the massive amount of global, unlabeled data to unlock the potential of applications with small labeled datasets.
Create models with higher robustness against class imbalance[3]
Improve spatiotemporal generalization through greater spatiotemporal coverage at training time

[3] Liu, Hong, et al. "Self-supervised learning is more robust to dataset imbalance." arXiv preprint arXiv:2110.05025 (2021)

9 of 11

Information restoration as a self-supervised learning paradigm - Masked Autoencoders [4]

Masked tokens are not processed by the encoder → Efficient
Loss is applied only on masked patches
More info in the notebook

Learnable tokens representing the masked patches + encoded visible patches
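The masking logic above can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation from [4] or from the tutorial notebook: the encoder and decoder are replaced by identity functions, the mask token is a fixed zero vector rather than a learned parameter, and all sizes and function names are assumptions made for the example.

```python
import numpy as np


def random_masking(tokens, mask_ratio, rng):
    """Pick a random subset of patch tokens to keep visible.

    tokens: array of shape (num_patches, dim).
    Returns the visible tokens, a boolean mask (True = masked) and kept indices.
    """
    num_patches = tokens.shape[0]
    num_keep = int(round(num_patches * (1.0 - mask_ratio)))
    perm = rng.permutation(num_patches)
    ids_keep = np.sort(perm[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[ids_keep] = False
    return tokens[ids_keep], mask, ids_keep


def assemble_decoder_input(encoded_visible, ids_keep, num_patches, mask_token):
    """Fill every position with the mask token, then restore the encoded visible tokens."""
    full = np.tile(mask_token, (num_patches, 1))
    full[ids_keep] = encoded_visible
    return full


def masked_mse(pred, target, mask):
    """Mean squared error computed over masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return float(per_patch[mask].mean())


rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))  # toy input: 16 patches, 8 values each

# Only the visible 25% of the patches would be passed to the encoder.
visible, mask, ids_keep = random_masking(patches, mask_ratio=0.75, rng=rng)
encoded = visible  # identity stand-in for the encoder (assumption)

mask_token = np.zeros(8)  # learnable in a real model; fixed here (assumption)
decoder_in = assemble_decoder_input(encoded, ids_keep, 16, mask_token)

prediction = decoder_in  # identity stand-in for the decoder (assumption)
loss = masked_mse(prediction, patches, mask)
```

Because the encoder only sees `visible`, its cost scales with the number of kept patches, and errors on visible patches do not contribute to `loss`.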

[4] He, Kaiming, et al. "Masked autoencoders are scalable vision learners." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

10 of 11

Information restoration as a self-supervised learning paradigm - Masked Autoencoders

Particularly popular in the RS domain, with many works building on top of it and incorporating domain knowledge, e.g.:

SatMAE[5]
ScaleMAE[6]
FoMo-Net[7]

[5] Cong, Yezhen, et al. "Satmae: Pre-training transformers for temporal and multi-spectral satellite imagery." Advances in Neural Information Processing Systems 35 (2022): 197-211.

[6] Reed, Colorado J., et al. "Scale-mae: A scale-aware masked autoencoder for multiscale geospatial representation learning." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

[7] Bountos, Nikolaos Ioannis, Arthur Ouaknine, and David Rolnick. "FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models." arXiv preprint arXiv:2312.10114 (2023).
