1 of 13

FOCUS on Contamination: A Noise-Aware Geospatial Learning Framework for PFAS Contamination Mapping

Jowaria Khan

PhD Student, Department of Computer Science and Engineering

University of Michigan

Elizabeth Bondi-Kelly

Assistant Professor, Department of CSE

University of Michigan

Alexa Friedman

Environmental Research Scientist

Environmental Working Group

Sydney Evans

Environmental Research Scientist

Environmental Working Group

David Andrews

Environmental Research Scientist

Environmental Working Group

Kaley Beins

Environmental Research Scientist

Environmental Working Group

Katherine Manz

Assistant Professor, Department of Public Health

University of Michigan

Runzi Wang

Assistant Professor, Department of Human Ecology

University of California, Davis

Rachel Klein

Research Laboratory Specialist, Department of Public Health

University of Michigan

2 of 13

AI for Environmental & Public Health Mapping

3 of 13

Core Challenges

Lab-based analysis is costly, time-intensive, and difficult to scale for large spatial datasets.

Data gaps

$300

4 of 13

Core Challenges

  • Many contaminants are not visible in satellite imagery (e.g., PFAS, heavy metals).

  • No direct signal for pollution.

➡️ We can’t directly observe this contamination from space

5 of 13

Key Idea #1: Proxies

  • Use related signals:

- Land cover

- Proximity to sources

- Hydrology

➡️ Estimate contamination from proxy signals.
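
As a rough illustration of how proxy signals could become model inputs, the sketch below assembles a small feature record for one location. The slides do not say how proxies are encoded, so the `proxy_features` helper, the feature names, and the example coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def proxy_features(lat, lon, land_cover, is_surface_water, sources):
    """Hypothetical proxy feature record for one location.

    sources: list of (lat, lon) pairs for known or suspected PFAS sources
    (an assumed input, not something the slides define).
    """
    nearest = min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in sources)
    return {
        "land_cover": land_cover,              # land-cover class name
        "dist_to_source_km": nearest,          # proximity to sources
        "is_surface_water": is_surface_water,  # crude hydrology indicator
    }

# Made-up coordinates; the first source coincides with the query point.
features = proxy_features(42.28, -83.74, "developed", False,
                          [(42.28, -83.74), (42.33, -83.05)])
```

A real pipeline would more likely compute these features as raster layers aligned to the imagery than point by point.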

6 of 13

Key Idea #2: From Sparse Data → Reliable Maps

FOCUS: a Geospatial Framework for EnvirOnmental Contamination with Uncertainty Scaling

➡️ Learns from sparse, noisy PFAS samples

7 of 13

Label Noise is Spatially Structured

Not all pixels are equally reliable

Image Patch

Before: central pixel (ground-truth point)

After: if the center pixel (ground-truth point) is high concentration, all surface-water area in the patch is labeled high concentration; otherwise it is labeled low concentration.

Key Idea #2
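
The label-expansion rule on this slide can be sketched in a few lines. This is a minimal illustration, assuming a boolean surface-water mask per patch. The `expand_label` name and the `IGNORE` value for non-water pixels are hypothetical, since the slides do not say how non-water pixels are treated.

```python
IGNORE = -1  # assumed marker for pixels that receive no pseudo-label

def expand_label(water_mask, center_is_high):
    """Propagate the center-pixel label to all surface-water pixels in a patch.

    water_mask: 2D list of booleans (True = surface water).
    center_is_high: True if the ground-truth sample at the center is high concentration.
    Returns a 2D list holding 1 (high), 0 (low), or IGNORE (not surface water).
    """
    label = 1 if center_is_high else 0
    return [[label if is_water else IGNORE for is_water in row] for row in water_mask]

water = [
    [False, True,  True],
    [False, True,  False],
    [False, False, False],
]
pseudo = expand_label(water, center_is_high=True)
```

Every water pixel inherits the same label no matter how far it is from the sample, which is why the resulting label noise is spatially structured.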

8 of 13

Pixel Confidence Map (Mᵢ)

- We compute a confidence score per pixel.

- Based on environmental priors:

    • Distance to PFAS sources
    • Land cover type
    • Distance from sample
    • Downstream connectivity

Ground truth pixel; most certain

Higher confidence areas; relatively certain about the assigned pseudo-label

Lower confidence areas; relatively uncertain about the assigned pseudo-label
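
One way to picture a per-pixel confidence score is as a combination of the four priors listed above. The sketch below is an assumption-heavy illustration, not the actual definition of Mᵢ: the exponential decay, the scale parameters, the equal weighting, and the `pixel_confidence` name are all hypothetical.

```python
import math

def pixel_confidence(dist_to_source_km, land_cover_weight, dist_from_sample_px,
                     is_downstream, is_center=False,
                     source_scale_km=5.0, sample_scale_px=10.0):
    """Hypothetical confidence score in [0, 1] for one pseudo-labeled pixel.

    land_cover_weight: assumed prior in [0, 1] for the pixel's land-cover type.
    The direction of each term assumes a high-concentration pseudo-label.
    """
    if is_center:
        return 1.0  # ground-truth pixel: most certain
    source_term = math.exp(-dist_to_source_km / source_scale_km)    # nearer a source -> higher
    sample_term = math.exp(-dist_from_sample_px / sample_scale_px)  # nearer the sample -> higher
    flow_term = 1.0 if is_downstream else 0.5                       # downstream-connected -> higher
    # Equal-weight average of the four priors; the slides do not give the real weighting.
    return (source_term + land_cover_weight + sample_term + flow_term) / 4.0

near = pixel_confidence(1.0, 0.8, 2.0, True)
far = pixel_confidence(20.0, 0.8, 30.0, False)
```

Here `near` comes out higher than `far`, mirroring the higher-confidence and lower-confidence regions in the legend.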

Key Idea #2

9 of 13

Noise-Aware Learning (FOCUS Loss)

- Confidence (Mᵢ) → trust/reliability

- Predicted probability

- Focal term → focus on hard examples

- Standard supervision term

Learns from pixels that are both difficult and reliable

Theoretical insight: FOCUS optimizes a valid surrogate of the noisy likelihood under pixel-wise label noise
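
The loss equation itself is not reproduced in this text, so the sketch below shows only one plausible way the four ingredients could combine: a focal loss whose per-pixel contribution is scaled by confidence, roughly sum(Mᵢ · (1 − pₜ)^γ · (−log pₜ)) / sum(Mᵢ). It is an assumption, not the authors' exact formulation; `focus_style_loss`, `gamma`, and the normalization by total confidence are hypothetical choices.

```python
import math

def focus_style_loss(probs, labels, confidence, gamma=2.0, eps=1e-7):
    """Confidence-weighted focal loss over labeled pixels (illustrative only).

    probs: predicted probability of the high-concentration class, per pixel.
    labels: pseudo-label per pixel, 1 (high) or 0 (low).
    confidence: per-pixel confidence M_i in [0, 1].
    """
    total, weight_sum = 0.0, 0.0
    for p, y, m in zip(probs, labels, confidence):
        p_t = p if y == 1 else 1.0 - p        # probability assigned to the given label
        p_t = min(max(p_t, eps), 1.0 - eps)   # clamp for numerical stability
        focal = (1.0 - p_t) ** gamma          # larger for hard (poorly fit) pixels
        ce = -math.log(p_t)                   # standard supervision term
        total += m * focal * ce
        weight_sum += m
    return total / weight_sum if weight_sum > 0 else 0.0

hard_reliable = focus_style_loss([0.6], [1], [1.0])
easy_reliable = focus_style_loss([0.9], [1], [1.0])
```

A pixel with zero confidence contributes nothing, and among equally trusted pixels the harder one dominates, which matches the slide's "difficult and reliable" summary.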

Key Idea #2

10 of 13

FOCUS: Training Workflow

11 of 13

FOCUS: Main Results

12 of 13

From Model to Practice: Web Map Interface

  • Interactive risk maps + uncertainty layers.

  • Query locations / compare years.

  • Guides where to sample next.
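
As a hedged illustration of how an uncertainty layer might guide sampling, the sketch below ranks candidate sites by the binary entropy of their predicted probability. The slides do not describe the actual ranking rule, so the entropy criterion, the `rank_sampling_sites` helper, and the example values are all hypothetical.

```python
import math

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli prediction; highest near p = 0.5."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def rank_sampling_sites(predictions, top_k=3):
    """Return the top_k site ids with the most uncertain predictions.

    predictions: dict of site id -> predicted probability of high concentration.
    """
    scored = sorted(predictions.items(), key=lambda kv: binary_entropy(kv[1]), reverse=True)
    return [site for site, _ in scored[:top_k]]

# Made-up predictions for four candidate sites.
sites = {"A": 0.95, "B": 0.52, "C": 0.10, "D": 0.70}
next_sites = rank_sampling_sites(sites, top_k=2)
```

Other criteria, such as weighting uncertainty by predicted risk or by nearby population, would be equally reasonable starting points.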

13 of 13

From Challenges to Key Ideas

Key Challenges

  • Sparse, costly sampling → limited coverage.

  • No direct signal in satellite imagery.

Our Key Ideas

  • Use proxy signals (land cover, sources, hydrology).

  • Model spatially varying label reliability.

➡️ FOCUS: combines proxies + uncertainty to learn from noisy supervision

Impact

  • Identify under-sampled high-risk regions.
  • Guide targeted data collection.
  • Support environmental decisions such as mitigation.

Future Directions: active data collection, physical modeling