1 of 44

Rethinking the grid: Towards less distorted imagery and AI

Daniel Loos, Gregory Duveiller, Fabian Gans

Max Planck Institute for Biogeochemistry, Jena, Germany

Daniel Loos: Rethinking the grid

2 of 44

Maps distort our view of the world

https://www.businessinsider.com/greenland-africa-comparison-2014-5

3 of 44

Trade-off: Projections preserve either area or shape

https://en.wikipedia.org/wiki/Tissot%27s_indicatrix

Equal area: Behrmann projection

Equal angle (conformal): Mercator projection
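
The trade-off can be made concrete with the local scale factors that Tissot's indicatrix visualizes. The sketch below is not from the talk; it assumes a spherical Earth and uses the textbook scale factors of the Mercator projection and of the cylindrical equal-area projection with a 30° standard parallel (Behrmann). The function names are illustrative.

```python
import math


def mercator_scales(lat_deg):
    """Return (h, k, area) for Mercator: meridian scale, parallel scale, area scale."""
    sec = 1.0 / math.cos(math.radians(lat_deg))
    # Conformal: both linear scales are equal, so shapes are kept but area grows.
    return sec, sec, sec * sec


def behrmann_scales(lat_deg, std_parallel_deg=30.0):
    """Return (h, k, area) for a cylindrical equal-area projection (Behrmann at 30 degrees)."""
    phi = math.radians(lat_deg)
    phi0 = math.radians(std_parallel_deg)
    h = math.cos(phi) / math.cos(phi0)  # scale along the meridian
    k = math.cos(phi0) / math.cos(phi)  # scale along the parallel
    # Equal area: h * k is always 1, but h != k away from the standard parallel.
    return h, k, h * k


if __name__ == "__main__":
    for lat in (0, 30, 60, 80):
        _, _, merc_area = mercator_scales(lat)
        h, k, behr_area = behrmann_scales(lat)
        print(f"lat {lat:2d}: Mercator area x{merc_area:6.2f} | "
              f"Behrmann area x{behr_area:.2f}, shape ratio k/h = {k / h:5.2f}")
```

Mercator keeps the shape ratio at 1 and inflates area towards the poles; Behrmann keeps area at 1 and stretches shapes instead.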

4 of 44

Distortions matter at global scale

Models learn patterns from distorted images

5 of 44

Solution: local projections

https://nsidc.org/data/user-resources/help-center/guide-nsidcs-polar-stereographic-projection

NSIDC's Polar Stereographic Projection, optimized for sea ice applications

Pro: less distortion
Con: can't see the rest of the Earth

6 of 44

Solution: Multiple projections

  • Universal Transverse Mercator coordinate system (UTM)
  • In use since the early 1940s
  • 60 local projections (zones)
  • MGRS: 110 km grid
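
As a minimal sketch of how a point is assigned to one of the 60 zones: each standard zone is 6° of longitude wide, and the WGS 84 / UTM EPSG codes are 326xx (north) and 327xx (south). The function names are illustrative, and the special zones around Norway and Svalbard are deliberately ignored here.

```python
def utm_zone(lon_deg):
    """Standard UTM zone number (1..60) for a longitude in degrees."""
    # Zone 1 starts at 180 W; the modulo wraps 180 E back to zone 1.
    return int(((lon_deg + 180.0) % 360.0) // 6.0) + 1


def utm_epsg(lon_deg, lat_deg):
    """EPSG code of the WGS 84 / UTM zone covering the point."""
    base = 32600 if lat_deg >= 0 else 32700  # northern or southern hemisphere
    return base + utm_zone(lon_deg)


if __name__ == "__main__":
    # Jena, Germany (longitude, latitude)
    print(utm_zone(11.586), utm_epsg(11.586, 50.927))
```

Because every zone has its own projection, two neighbouring points can end up in different coordinate systems, which is where the tile overlap discussed next comes from.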

https://natural-resources.canada.ca/

7 of 44

Problem: Overlap using rectangular tiles

Bauer-Marschallinger and Falkner (2023)

UTM zones are not rectangular, but equal-area tiles are
⇓
Overlap

8 of 44

Wasting petabytes in UTM-based satellite products

Sentinel-2

33% of land data is duplicated and has to be stored, downloaded, and processed

Bauer-Marschallinger and Falkner (2023)

9 of 44

Less overlap but higher shape distortion in Landsat

Japelaghi et al. 2022
https://www.earthdata.nasa.gov/esds/harmonized-landsat-sentinel-2

UTM-based (Sentinel-2)

WRS-2 (Landsat)

10 of 44

Analog paper maps force us to make 2D maps

Digital data structures are more flexible

11 of 44

Globes as polyhedrons

Chong Fat (Wikimedia), tukluk.eu

12 of 44

Examples of grids using polyhedrons

ICON (DWD, MPI-MET)

GraphCast (Google)

AIFS (ECMWF)

13 of 44

Adaptation needed for Earth observation data

              Modelling        Earth Observation
Bottleneck    Compute          Loading
Projection    no               yes
Resolution    Lower (0.25°)    Higher (10 m)

We need to think about

  • Projection
  • Data structures

14 of 44

Solution: Discrete Global Grid Systems (DGGS)

DGGS = Polyhedron + Projection + Index

Efficient data structure for high-resolution satellite imagery

15 of 44

Not wasting petabytes using a DGGS

DGGS cells may cover multiple polyhedron faces
⇓
Almost no overlap

https://www.uber.com/en-DE/blog/h3/

  • Multiple local projections with almost no data overlap
  • Border cells span multiple polyhedron faces
  • The polyhedron provides just a scaffold for cell placement

16 of 44

ML model training is dominated by data loading and preparation

Estimated savings using DGGS (Sentinel-2)

17 of 44

DGGS are standardized by OGC and ISO


18 of 44

Applications of DGGS

https://www.esa.int/Applications/Observing_the_Earth/FutureEO/SMOS
Rawson et al. 2022: Big Earth Data, Vol. 6, No. 3, 294–322
Li et al. 2022: ISPRS International Journal of Geo-Information, 11, 627

ESA Soil Moisture and Ocean Salinity (SMOS) L1c data

Multi-scale flood mapping

Integration of raster and vector data

19 of 44

Features of a DGGS

Polyhedron

    • Tetrahedron
    • Cube
    • Octahedron
    • Dodecahedron
    • Icosahedron

Polygon

    • Triangle
    • Rectangle
    • Hexagon

Aperture

    • 3
    • 4
    • 7

Index

    • hierarchical
    • axial (x, y)
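
To illustrate what the aperture controls, the sketch below counts cells per resolution for a hexagonal grid on an icosahedron. It assumes the common closed form of 10 · aperture^resolution + 2 cells (the 12 vertex cells are pentagons) and an authalic sphere radius of 6371.007 km. Both are assumptions for illustration, not values taken from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.007  # assumed authalic sphere radius


def n_cells(aperture, resolution):
    """Number of cells of an icosahedral hexagonal grid at a given resolution."""
    return 10 * aperture ** resolution + 2


def mean_cell_area_km2(aperture, resolution):
    """Mean cell area if the sphere surface is shared equally between all cells."""
    surface = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
    return surface / n_cells(aperture, resolution)


if __name__ == "__main__":
    for aperture in (3, 4, 7):
        for resolution in (1, 5, 10):
            print(f"aperture {aperture}, resolution {resolution:2d}: "
                  f"{n_cells(aperture, resolution):>12d} cells, "
                  f"{mean_cell_area_km2(aperture, resolution):14.3f} km2 per cell")
```

A larger aperture reaches fine resolutions in fewer levels, while a smaller aperture gives a finer-grained resolution pyramid.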

20 of 44

Available DGGS

              Uber H3             Google S2               DGGRID
Polyhedron    Icosahedron         Cube                    Icosahedron
Polygon       Hexagon             Quad                    Hexagon, Quad, Triangle
Projection    Gnomonic            Quad. Spherical Cube    Snyder Equal Area
Index         1D, Hierarchical    1D, Hilbert Curve       1D, 2D, 3D

21 of 44

Making DGGS native data cubes for global satellite imagery

Past:

  • Grid definitions
  • Tools to transform geographical coordinates into cell ids and vice versa

Now:

  • Select an appropriate DGGS
  • Develop tools to build and work with DGGS native data cubes

22 of 44

DGGS native data cubes

Kmoch et al. 2022

Traditional data cube: Longitude, Latitude, Time, Band

DGGS native data cube: Cell ID, Time, Band

Analyze data in low-distortion DGGS space
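
The change of dimensions can be sketched with NumPy. The example assumes a precomputed lookup `cell_ids` that maps every raster pixel to a DGGS cell; that lookup is hypothetical here and would normally come from a DGGS library. Pixels falling into the same cell are averaged, which is only one of several possible resampling choices.

```python
import numpy as np


def to_cell_cube(raster, cell_ids, n_cells):
    """Average a (time, lat, lon) raster into a (time, cell) array.

    cell_ids is a (lat, lon) integer array giving the cell of each pixel
    (hypothetical lookup, assumed to be computed elsewhere).
    """
    flat_ids = cell_ids.ravel()
    counts = np.bincount(flat_ids, minlength=n_cells)
    out = np.full((raster.shape[0], n_cells), np.nan)
    for t in range(raster.shape[0]):
        sums = np.bincount(flat_ids, weights=raster[t].ravel(), minlength=n_cells)
        # Cells without any pixel keep NaN.
        np.divide(sums, counts, out=out[t], where=counts > 0)
    return out


if __name__ == "__main__":
    raster = np.arange(16, dtype=float).reshape(2, 2, 4)  # (time, lat, lon)
    cell_ids = np.array([[0, 0, 1, 1],
                         [0, 0, 2, 2]])
    cube = to_cell_cube(raster, cell_ids, n_cells=4)
    print(cube.shape)  # one spatial axis (cell id) instead of two
```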

23 of 44

Least distortion with hexagons and the Snyder Equal Area projection

Kmoch et al. 2022

Area distortion

Shape distortion

Snyder Equal Area

24 of 44

Hexagonal cells represent neighbours best

  • More uniform representation of fluxes
  • More uniform distances inside a bounding box
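
A small sketch of why neighbours are more uniform on a hexagonal grid: in a planar hexagonal lattice all six neighbours are at the same centre distance, whereas the eight neighbours of a square cell come at two different distances. The axial coordinate convention (pointy-top hexagons, centre spacing 1) is an assumption made for this illustration.

```python
import math

# Axial (q, r) offsets of the six hexagonal neighbours.
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]
# Offsets of the eight neighbours of a square cell (edges and corners).
SQUARE_NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]


def hex_centre(q, r):
    """Cartesian centre of the hexagon with axial coordinates (q, r)."""
    return (q + r / 2.0, r * math.sqrt(3.0) / 2.0)


def hex_neighbour_distances():
    """Centre distances from a hexagon to each of its six neighbours."""
    return [math.hypot(*hex_centre(q, r)) for q, r in HEX_NEIGHBOURS]


def square_neighbour_distances():
    """Centre distances from a square cell to each of its eight neighbours."""
    return [math.hypot(dx, dy) for dx, dy in SQUARE_NEIGHBOURS]


if __name__ == "__main__":
    print("hexagon:", sorted({round(d, 6) for d in hex_neighbour_distances()}))
    print("square: ", sorted({round(d, 6) for d in square_neighbour_distances()}))
```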

25 of 44

Hexagonal group convolution for pattern recognition independent of rotation
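
The idea can be sketched in a few lines. On a hexagonal lattice a kernel can be rotated exactly by multiples of 60° without interpolation; in axial coordinates one such rotation maps (q, r) to (-r, q + r). The toy below applies all six orientations of a kernel to a patch and takes the maximum response. Pooling over orientations is an assumption chosen for illustration; a full group convolution would keep the six orientation channels.

```python
def rotate60(q, r):
    """Rotate an axial offset by 60 degrees around the origin."""
    return (-r, q + r)


def rotate_cells(cells):
    """Rotate a dict of {(q, r): value} by 60 degrees."""
    return {rotate60(q, r): v for (q, r), v in cells.items()}


def orientations(kernel):
    """All six rotated copies of a hexagonal kernel."""
    out = [kernel]
    for _ in range(5):
        out.append(rotate_cells(out[-1]))
    return out


def response(kernel, patch):
    """Correlation of one kernel orientation with a patch centred on the same cell."""
    return sum(w * patch.get(offset, 0) for offset, w in kernel.items())


def invariant_response(kernel, patch):
    """Maximum response over all six kernel orientations."""
    return max(response(k, patch) for k in orientations(kernel))


if __name__ == "__main__":
    kernel = {(0, 0): 1, (1, 0): 2, (0, 1): -1}
    patch = {(0, 0): 3, (1, 0): 5, (0, 1): 1, (-1, 1): 4}
    rotated_patch = rotate_cells(patch)
    print(invariant_response(kernel, patch) == invariant_response(kernel, rotated_patch))
```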


26 of 44

Hexagons in nature

Shebeko / Shutterstock, NASA, Picfair

27 of 44

How to store a hexagonal grid?

Rectangular grids:

 Storing longitude and latitude in rows and columns

How about hexagonal DGGS grids?

28 of 44

DGGS cell indices

Multidimensional index: Encode distance to an origin cell

1D index: Encode parent cell in the child cell
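
The two index families can be contrasted with a toy example. The digit-string scheme below is hypothetical and is not the encoding used by H3, S2 or DGGRID; it only shows the principle that a 1D hierarchical id carries its parent as a prefix, while a multidimensional index stores offsets from an origin cell.

```python
def child_ids(parent_id, aperture=4):
    """1D hierarchical index: a child id is the parent id plus one digit."""
    return [parent_id + str(digit) for digit in range(aperture)]


def parent_id(cell_id):
    """The parent is recovered by dropping the last digit."""
    return cell_id[:-1]


def is_ancestor(ancestor_id, cell_id):
    """Containment test reduces to a prefix test."""
    return cell_id.startswith(ancestor_id)


def offset_from_origin(cell, origin=(0, 0)):
    """Multidimensional index: a cell is addressed by its (i, j) offset from an origin."""
    return (cell[0] - origin[0], cell[1] - origin[1])


if __name__ == "__main__":
    print(child_ids("A2"))
    print(parent_id("A21"), is_ancestor("A", "A213"))
    print(offset_from_origin((7, 3), origin=(5, 5)))
```

The 1D form makes parent and child lookups cheap; the offset form makes spatial ranges such as bounding boxes cheap, because they become plain index ranges.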

29 of 44

Multidimensional index is fastest in bounding box queries

Add new cell ids

Neighbour search

30 of 44

Array storage is most efficient for axial indices

  • Data can be stored in n-dimensional arrays (tensors, data cubes) with few gaps
  • Coordinates can be derived from row and column numbers

31 of 44

DGGRID Q2DI index: Storing 20 faces into 10 matrices

Figure labels: i coord, j coord, n=1
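
A minimal sketch of such a layout, under assumptions that should be checked against the DGGRID documentation: each of the 10 matrices is taken to be one quad (two adjacent triangular faces) holding 2^r × 2^r cells at resolution r for aperture 4, and the two remaining cells are taken to be the single cells at the poles. The function name and array layout are illustrative only.

```python
import numpy as np


def allocate_q2di_layer(resolution, dtype=np.float32):
    """Allocate storage for one resolution level of an aperture 4 hexagonal grid.

    Assumed layout: 10 square matrices addressed by (quad, i, j)
    plus two single polar cells.
    """
    n = 2 ** resolution
    quads = np.full((10, n, n), np.nan, dtype=dtype)
    poles = np.full(2, np.nan, dtype=dtype)
    return quads, poles


if __name__ == "__main__":
    quads, poles = allocate_q2di_layer(resolution=6)
    # Write one cell addressed by quad number (1..10) and (i, j).
    quad, i, j = 3, 10, 20
    quads[quad - 1, i, j] = 287.5
    print(quads.shape, quads.size + poles.size)
```

Under these assumptions the total cell count matches 10 · 4^r + 2, so the arrays contain no unused elements.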

32 of 44

Accessing hexagonal neighbours in rectangular arrays

Figure panels: Array, Surface. Labels: i, j, stride, lazy padding
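
How the six neighbours can be read from a rectangular array is sketched below. It assumes axial (i, j) storage, in which the hexagonal neighbours are the four edge neighbours plus two of the four diagonal ones. Plain zero padding stands in for the lazy padding on the slide, which would presumably take values from adjacent matrices instead.

```python
import numpy as np

# Offsets (di, dj) of the six hexagonal neighbours in axial array storage.
HEX_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_neighbour_sum(a):
    """Sum of the six hexagonal neighbours of every cell of a 2D array."""
    padded = np.pad(a, 1, mode="constant", constant_values=0)
    n_i, n_j = a.shape
    out = np.zeros_like(a)
    for di, dj in HEX_OFFSETS:
        # Shifted view: element (i, j) of the slice is a[i + di, j + dj].
        out += padded[1 + di:1 + di + n_i, 1 + dj:1 + dj + n_j]
    return out


if __name__ == "__main__":
    a = np.zeros((5, 5), dtype=int)
    a[2, 2] = 1
    print(hex_neighbour_sum(a))
```

The diagonals (+1, +1) and (-1, -1) are skipped on purpose: in axial storage those cells are two steps away on the hexagonal surface.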

33 of 44

Selected DGGS: DGGRID ISEA4H Q2DI

Feature       Selected value       Reason
Polyhedron    Icosahedron          lowest spatial distortion
Polygon       Hexagon              lowest shape distortion; equal neighbours; fast group convolution
Projection    Snyder Equal Area    lowest area distortion
Aperture      4                    efficient multi-resolution pyramids; efficient array storage
Index         Q2DI                 efficient array storage

34 of 44

DGGS.jl: A Julia package for DGGS native data cubes

https://github.com/danlooo/DGGS.jl

using DGGS
p1 = open_dggs_pyramid("https://s3.bgc-jena.mpg.de:9000/dggs/datasets/example-ccsm3")

DGGSPyramid
DGGS: DGGRID ISEA4H Q2DI ⬢
Levels: Integer[5, 4, 6, 2, 3]
Non spatial axes:
  Time CFTime.DateTimeNoLeap
  plev Float64
Variables:
  tas air_temperature (:Time) K Union{Missing, Float32} aggregated
  ua eastward_wind (:plev, :Time) m s-1 Union{Missing, Float32} aggregated
  pr precipitation_flux (:Time) kg m-2 s-1 Union{Missing, Float32} aggregated
  area meter2 Union{Missing, Float32}

35 of 44

DGGS.jl: Features

Function name         Description
transform_points      Convert between geographical coordinates and cell ids
to_geo_array          Convert DGGS array to a raster
to_dggs_array         Convert raster data into a DGGS array
to_dggs_pyramid       Create lower spatial resolutions of a DGGS array
getindex              Select points and disks
write_dggs_pyramid    Write DGGS array to Zarr
plot                  Plot the DGGS data cube as a globe using Makie

36 of 44

DGGS.jl: Data access

Name       Description                                                                            Command
Pyramid    All data within the same DGGS grid                                                     p1
Layer      All variables at a given spatial resolution                                            l = p1[level = 6]
Array      One variable at a given spatial resolution; may contain other dimensions, e.g. time    a = p1[level = 6, id=:tas, time=1]
Cell       One variable at one point                                                              a[11.586, 50.927]
Disk       One variable at one point and its 6 neighbors                                          a[11.586, 50.927, 1:2]

Disks are equal-distance bounding boxes in a DGGS

1-disk

2-disk
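
What a k-disk contains can be sketched on a planar hexagonal lattice with axial coordinates. This is an illustration of the concept only and does not reproduce how DGGS.jl selects disks; the function names are made up for the example.

```python
def hex_distance(q1, r1, q2, r2):
    """Number of hexagon steps between two cells in axial coordinates."""
    dq, dr = q2 - q1, r2 - r1
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def disk(q, r, k):
    """All cells within k steps of the centre cell (q, r), centre included."""
    return [(q + dq, r + dr)
            for dq in range(-k, k + 1)
            for dr in range(max(-k, -dq - k), min(k, -dq + k) + 1)]


if __name__ == "__main__":
    for k in (1, 2, 3):
        cells = disk(0, 0, k)
        print(f"{k}-disk: {len(cells)} cells, "
              f"max distance {max(hex_distance(0, 0, q, r) for q, r in cells)}")
```

A k-disk holds 1 + 3k(k + 1) cells, all of them at most k steps from the centre, which is why it behaves like a bounding box with equal distance in every direction.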

37 of 44

DGGS.jl: Documentation

https://danlooo.github.io/DGGS.jl

38 of 44

DGGS.jl: Globe plot


39 of 44

DGGS.jl: Map plot


40 of 44

DGGS.jl: Native plot

No regridding to lat/lon required

41 of 44

Zarr to store DGGS native data cubes

  • RAM
  • Filesystem
  • HTTP
  • S3

stars

42 of 44

Pangeo xarray package XDGGS

  • Supports rHEALPix and H3
  • Only 1D indices
  • Currently in development

https://github.com/xarray-contrib/xdggs

import urllib.request

import xarray as xr
import xdggs

# Download the example file and open it with xarray
urllib.request.urlretrieve("https://zenodo.org/records/10075001/files/healpix_nolotation.nc?download=1", "healpix.nc")
ds = xr.open_dataset("healpix.nc")

# Add DGGS data: replace the x/y raster axes with a single cell axis
ds = (
    ds.load()
    .drop_vars(["latitude", "longitude"])
    .stack(cell=["x", "y"], create_index=False)
)

# Describe the grid that the cell ids refer to
ds.cell_ids.attrs = {
    "grid_name": "healpix",
    "nside": 4096,
    "nest": True,
}

ds = ds.set_xindex("cell_ids", xdggs.HealpixIndex, nside=4096, nest=True)

ds.sel(cell_ids=[11320973, 11320975])  # access using cell ids
ds.dggs.sel_latlon([48.0, 48.1], -5.0)  # access using lat/lon coords

43 of 44

Summary

Discrete Global Grid System (DGGS) with a multidimensional index and hexagonal cells

  • Least distortion
  • Accurate representation of directions
  • Efficient data cube storage (no overlaps, array)
  • Fast bounding box retrieval
  • Fast group convolutions

44 of 44

Thank you!

Daniel Loos
dloos@bgc-jena.mpg.de

Max Planck Institute for Biogeochemistry

Hans-Knöll-Straße 10
07745 Jena
Germany

bgc-jena.mpg.de

This project has received funding from the Open-Earth-Monitor Cyberinfrastructure project, which is part of the European Union's Horizon Europe research and innovation programme under grant agreement No. 101059548.