1 of 15

Current State of Cloud-Native Geospatial Formats

Brianna R. Pagán

NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC)

Adnet Systems

GESDISC

2 of 15

Motivation

EOSDIS Archive Growth Projections

This volume of data will make it impractical for researchers to download the data if they want to do timely analyses.

3 of 15

Beyond EOSDIS

Upcoming SWOT mission

  • Produces 20 TB every day
  • Storing one day of data requires 20 laptops with 1 TB of storage each
  • With a 25 Mb/s download speed, each laptop takes about four days to download its 1 TB share
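
The figures above can be checked with a few lines of arithmetic. This is a rough sketch, assuming the quoted speed means 25 megabits per second and that 1 TB is 10^12 bytes; the function name is illustrative only.

```python
TB = 10**12  # bytes in one decimal terabyte (assumption)

def download_days(size_bytes: float, speed_megabits_per_s: float) -> float:
    """Days needed to transfer size_bytes over a link of the given speed."""
    bytes_per_second = speed_megabits_per_s * 1_000_000 / 8
    return size_bytes / bytes_per_second / 86_400  # 86,400 seconds per day

daily_volume_tb = 20
laptops_needed = daily_volume_tb          # one 1 TB laptop per terabyte produced
days_per_laptop = download_days(1 * TB, 25)

print(laptops_needed, round(days_per_laptop, 1))  # 20 3.7
```

Under these assumptions each laptop needs roughly 3.7 days, which matches the "four days" on the slide. If the speed were 25 megabytes per second instead, the same download would take about 11 hours.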

McCabe et al., 2017

4 of 15

Definitions

  • Cloud-Native Geospatial: Geospatial standards and software built for the cloud from the ground up
  • Cloud-Native Geospatial Principles (from Chris Holmes):
    • Read Oriented
      • Performance (Parallel, Partial)
      • Convenience
      • Compatibility
      • Compression
    • HTTP and S3/GCS/Azure interfaces
    • Open Source
    • Multiscale Metadata

5 of 15

Cloud-Native Geospatial Formats

Raster

  • Snapshot-in-time gridded data: DEMs or a multi-band, single-date satellite acquisition
  • GeoTIFFs (.tif, .tiff)
  • Cloud Optimized GeoTIFFs (COGs), an OGC standard
  • Adds restrictions (internal tiles and overviews) so files perform well in cloud environments
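
To illustrate why overviews matter, here is a minimal sketch of how a reader might pick which overview to fetch for a requested resolution. It is a simplification, not the selection logic of any particular library; the function name, the 10 m base resolution, and the list of decimation factors are all assumptions made for the example.

```python
def pick_overview(base_res: float, factors: list[int], target_res: float) -> int:
    """Return the largest decimation factor whose pixel size does not
    exceed target_res. A result of 1 means the full-resolution image."""
    best = 1
    for factor in sorted(factors):
        if base_res * factor <= target_res:
            best = factor
    return best

# Hypothetical 10 m image with overviews at 2x, 4x, 8x and 16x.
factors = [2, 4, 8, 16]
print(pick_overview(10.0, factors, 100.0))  # 8 (80 m pixels)
print(pick_overview(10.0, factors, 15.0))   # 1 (full resolution)
```

A client that only needs a 100 m preview reads the 8x overview, which holds 1/64 of the pixels of the full image.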

[Figure: raster band divided into chunks]

6 of 15

Cloud-Native Geospatial Formats

Raster

  • Snapshot-in-time gridded data: DEMs or a multi-band, single-date satellite acquisition
  • GeoTIFFs (.tif, .tiff)
  • Cloud Optimized GeoTIFFs (COGs), an OGC standard
  • Adds restrictions (internal tiles and overviews) so files perform well in cloud environments

[Figure: raster band showing the desired selection and its corresponding chunks]
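
The idea of a desired selection mapping to a handful of chunks can be sketched as follows. This is an illustrative calculation only: the function name, the 512-pixel tile size, and the image dimensions are assumptions, not values from the slides.

```python
def tiles_for_window(row_off, col_off, height, width, tile_size=512):
    """Return (tile_row, tile_col) for every tile a pixel window touches."""
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]

# A 600 x 600 pixel window at row 300, column 900 of a 4096 x 4096 image.
needed = tiles_for_window(300, 900, 600, 600)
total_tiles = (4096 // 512) ** 2

print(needed)                                   # [(0, 1), (0, 2), (1, 1), (1, 2)]
print(f"{len(needed)} of {total_tiles} tiles")  # 4 of 64 tiles
```

Only the tiles that intersect the selection need to be read, here 4 of 64, instead of the whole file.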

8 of 15

Cloud-Native Geospatial Formats

Raster

  • Snapshot-in-time gridded data: DEMs or a multi-band, single-date satellite acquisition
  • GeoTIFFs (.tif, .tiff)
  • Cloud Optimized GeoTIFFs (COGs), an OGC standard
  • Adds restrictions (internal tiles and overviews) so files perform well in cloud environments
  • A true COG should be made available on a server supporting HTTP range requests
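
HTTP range requests are what let a client fetch individual tiles rather than the whole file. The sketch below builds the `Range` header for each tile from its byte offset and length, the kind of information a TIFF stores in its TileOffsets and TileByteCounts tags. The offsets and sizes are made up for illustration, and no request is actually sent.

```python
def range_header(offset: int, byte_count: int) -> str:
    """HTTP Range header value for byte_count bytes starting at offset.
    Both ends of an HTTP byte range are inclusive."""
    return f"bytes={offset}-{offset + byte_count - 1}"

# Hypothetical (offset, byte_count) per tile, keyed by (tile_row, tile_col).
tile_index = {
    (0, 1): (8_192, 40_000),
    (0, 2): (48_192, 38_500),
}

headers = {
    tile: {"Range": range_header(offset, count)}
    for tile, (offset, count) in tile_index.items()
}

for tile, header in headers.items():
    print(tile, header)
# (0, 1) {'Range': 'bytes=8192-48191'}
# (0, 2) {'Range': 'bytes=48192-86691'}
```

A server that supports range requests answers each of these with status 206 (Partial Content) and only the requested bytes.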

9 of 15

Cloud-Native Geospatial Formats

Multi-Dimensional Raster

  • Data cubes, typically with a time dimension, e.g. weather forecasts
  • NetCDF (.nc) / HDF (.hdf)
  • Zarr is an open-source specification for storing chunked, compressed, N-dimensional arrays
  • Ongoing efforts for standardization via OGC
  • GeoZarr Spec: https://github.com/christophenoel/geozarr-spec
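
As a rough illustration of chunked N-dimensional storage, the sketch below lists which chunk objects a reader would need for a given selection. It assumes the Zarr v2 default layout, where each chunk is stored under a key made of its chunk indices joined by "."; the cube dimensions, chunk shape, and function name are hypothetical.

```python
from itertools import product

def chunk_keys(selection, chunk_shape, separator="."):
    """selection holds one (start, stop) pair per dimension, stop exclusive.
    Returns the keys of all chunks that the selection overlaps."""
    per_dim = [
        range(start // size, (stop - 1) // size + 1)
        for (start, stop), size in zip(selection, chunk_shape)
    ]
    return [separator.join(str(i) for i in idx) for idx in product(*per_dim)]

# Hypothetical (time, y, x) cube stored in chunks of shape (1, 256, 256).
keys = chunk_keys([(10, 12), (0, 300), (500, 520)], (1, 256, 256))
print(keys)
# ['10.0.1', '10.0.2', '10.1.1', '10.1.2', '11.0.1', '11.0.2', '11.1.1', '11.1.2']
```

Because every chunk is a separate object, the eight chunks can be fetched in parallel and nothing outside the selection is transferred.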

[Figure: data cube with time and band axes]

10 of 15

Cloud-Native Geospatial Formats

Vector

  • Columnar data including lines and polygons
  • Shapefiles (.shp)
  • GeoParquet defines how to store vector data in Apache Parquet
  • Supports efficient filtering of chunks based on column stats
  • Ability to partition data into different files
  • Will enable spatial indexing
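
Filtering chunks on column statistics can be sketched without any Parquet library. In the example below each row group carries the minimum and maximum of its features' bounding-box coordinates, and a reader skips every group whose extent cannot overlap the query box. The class, field names, and values are hypothetical; a real reader would take these statistics from the Parquet file metadata.

```python
from dataclasses import dataclass

@dataclass
class RowGroupStats:
    """Extent of one row group, derived from min/max column statistics."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def overlaps(stats: RowGroupStats, query: tuple) -> bool:
    """True if the row group's extent intersects the query bounding box."""
    qxmin, qymin, qxmax, qymax = query
    return (stats.xmin <= qxmax and stats.xmax >= qxmin
            and stats.ymin <= qymax and stats.ymax >= qymin)

groups = [
    RowGroupStats("rg0", -180, -90, -90, 0),
    RowGroupStats("rg1", -90, 0, 0, 90),
    RowGroupStats("rg2", 0, 0, 90, 90),
]
query = (-80.0, 30.0, -70.0, 45.0)  # xmin, ymin, xmax, ymax

to_read = [g.name for g in groups if overlaps(g, query)]
print(to_read)  # ['rg1']
```

Only one of the three row groups has to be downloaded and decoded. Partitioning data into separate files applies the same pruning one level up.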

11 of 15

Cloud-Native Geospatial Formats

Point-Cloud

  • Set of data points in space, e.g. LiDAR measurements
  • Polygon File Format (.ply) or .xyz
  • Cloud Optimized Point Cloud (COPC): COGs for point clouds
  • LASzip (LAZ) is a ubiquitous geospatial point-cloud format

12 of 15

Which Format and When

  • COG vs Zarr is the wrong framing
  • There is no “ideal cloud format” - Howard Butler @Hobu dev for PDAL/COPC
    • “Rather they are extensions of formats which allow spatially accelerated reading over HTTP”

13 of 15

Which Format and When: GES DISC Example

  • What is my use case?
  • What are my limiting factors? (local storage? egress costs?)

14 of 15

Grounding

Meme used with permission from Dr. Julius Busecke, Associate Scientist in the Climate Data Science Laboratory

15 of 15

Thank you!

“The danger of user centered design is that it releases the designer of the responsibility for having a vision of the world… The user sees the world as it is. Our job as builders is to create the world as it could be.”
