Carsten Ehbrecht and Martin Schupfner

12th November 2021

GitHub Link

C3S_34e

Subsetting and Regridding

Climate Change


Carsten Ehbrecht and Martin Schupfner

12th November 2021

6th September 2024 GitHub Link

C3S_34e News from Copernicus

Subsetting and Regridding


Copernicus Climate Data Store

https://cds.climate.copernicus.eu/


Copernicus Climate Data Store


Copernicus Climate Data Store: Datasets by Product Type


Copernicus Climate Data Store: Download Datasets


Copernicus Climate Data Store: CDS API


Copernicus Climate Data Store: Rook WPS

  • CMIP6 data is retrieved by CDS from a synced data pool at DKRZ, CEDA and IPSL
  • Using the Rook subsetting service
    • https://github.com/roocs/rook
  • Provides operators based on xarray: subset, regrid, …
  • Operators are implemented by the Python library clisops
    • https://pypi.org/project/clisops/
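
To make the idea of a subset operator concrete, here is a minimal, dependency-free sketch of a bounding-box subset on a regular lat/lon field. It is not the clisops or Rook implementation; the function name `subset_bbox` and the `area` ordering (lon_min, lat_min, lon_max, lat_max) are assumptions made for this illustration only.

```python
def subset_bbox(lats, lons, field, area):
    """Return the part of a regular lat/lon field inside a bounding box.

    area = (lon_min, lat_min, lon_max, lat_max), bounds inclusive.
    This ordering is an assumption of the sketch.
    """
    lon_min, lat_min, lon_max, lat_max = area
    # Indices of rows (latitudes) and columns (longitudes) inside the box
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    cols = [j for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    return (
        [lats[i] for i in rows],
        [lons[j] for j in cols],
        [[field[i][j] for j in cols] for i in rows],
    )


# Tiny 3 x 4 example field; the values encode (row, column) as 10 * row + column
lats = [-45.0, 0.0, 45.0]
lons = [0.0, 90.0, 180.0, 270.0]
field = [[10 * i + j for j in range(len(lons))] for i in range(len(lats))]

sub_lats, sub_lons, sub_field = subset_bbox(lats, lons, field, (0.0, -10.0, 100.0, 50.0))
```

The real operators work on xarray objects and also handle time and level selection, which this sketch leaves out.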

CLISOPS


clisops

Subsetting

(Jupyter Notebook)


clisops

Regridding functionalities

powered by xESMF

(Jupyter Notebook)

Remapping, illustration.


Outline

  • A Zoo of grids
  • Remapping Methods
  • Remapping Tools
  • Constraints & Requirements
  • xESMF Regridder
  • Problems
  • clisops + xESMF


A Zoo of Grids

Schematic of a Discrete Climate Model Grid, from Earth Magazine and Kotamarthi et al. [2021]

https://utcdw.physics.utoronto.ca/UTCDW_Guidebook/Chapter2/section2.2_climate_modeling.html


A Zoo of Grids

Climate model experiments are simulated on various grids

  • structured grids - basically quadrilateral grids
    • lat-lon grid, gaussian grid (most atmosphere grids)
    • curvilinear grid (most regional and ocean grids)

  • unstructured grids - anything else
    • triangular grids (eg. ICON-ESM)
    • any kind of combination of polygons (eg. AWI-ESM)

→ connectivity information required (not provided in CMIP)

Overview of grids that ocean output was submitted on (CMIP6):

https://c6dreq.dkrz.de/files/grids.php


Remapping Methods

Regridding method                                        Preferably applied for
Bilinear                                                 Smoothly varying variables (eg. air temperature)
Conservative / Area weighted                             Upscaling, discontinuous variables, fluxes
Patch recovery (least square fit over patch of cells)    Accurate computation of derivatives
Nearest neighbour                                        Categorical data (eg. land surface type)

Remapping is mostly necessary when comparing data from different sources.
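
The following sketch illustrates why the method should match the variable. It interpolates inside a single unit grid cell, once bilinearly and once with nearest neighbour. The function names and the unit-cell setup are assumptions chosen for this example, not part of any of the tools discussed here.

```python
def bilinear(corners, x, y):
    """Bilinear interpolation inside a unit cell.

    corners = (f00, f10, f01, f11): values at (0,0), (1,0), (0,1), (1,1);
    x and y are the target position inside the cell, each in [0, 1].
    """
    f00, f10, f01, f11 = corners
    return (
        f00 * (1 - x) * (1 - y)
        + f10 * x * (1 - y)
        + f01 * (1 - x) * y
        + f11 * x * y
    )


def nearest(corners, x, y):
    """Value of the corner closest to (x, y); ties go to the upper corner."""
    f00, f10, f01, f11 = corners
    column = 1 if x >= 0.5 else 0
    row = 1 if y >= 0.5 else 0
    return ((f00, f10), (f01, f11))[row][column]


# Smooth variable (eg. air temperature in K): bilinear gives a sensible blend
temperature = (280.0, 284.0, 282.0, 286.0)
t_centre = bilinear(temperature, 0.5, 0.5)

# Categorical variable (eg. land surface type codes): a blend is meaningless
land_type = (1, 2, 3, 4)
blended_type = bilinear(land_type, 0.5, 0.5)   # not a valid category
nearest_type = nearest(land_type, 0.25, 0.75)  # always one of the input codes
```

The bilinear result for the category codes falls between two codes, which is why nearest neighbour is preferred for categorical data.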


Remapping Tools


General Constraints & ECMWF Requirements

  • One-way regridding to selected regular lat/lon grids, eg. the grids used for the IPCC Atlas

  • Supporting multiple regridding schemes
    • depending on the variable, model, source grid, …
      → to be selected by the user

  • No extrapolation
  • Proper handling of fill values (i.e. masks)
  • Within the roocs framework (dachar, daops, clisops):
    • preferably with python interface and xarray compatibility
    • no implicit actions when chaining operations (eg. subsetting + remapping) → user’s responsibility
  • Vertical interpolation (secondary priority)

No extrapolation to avoid unscientific/unrealistic results.
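
As an illustration of what a regular lat/lon target grid looks like, the sketch below builds cell centres and cell bounds for a global grid of a given resolution. The function name `regular_grid`, the 0 to 360 longitude convention, and the requirement that the resolution divides 180 evenly are assumptions of this example.

```python
def regular_grid(resolution):
    """Cell centres and bounds (degrees) of a global regular lat/lon grid.

    Assumes that `resolution` divides 180 evenly and that longitudes
    run from 0 to 360.
    """
    n_lat = round(180 / resolution)
    n_lon = round(360 / resolution)
    lat_bounds = [
        (-90 + i * resolution, -90 + (i + 1) * resolution) for i in range(n_lat)
    ]
    lon_bounds = [(j * resolution, (j + 1) * resolution) for j in range(n_lon)]
    # Cell centres are the midpoints of the bounds
    lats = [(lower + upper) / 2 for lower, upper in lat_bounds]
    lons = [(lower + upper) / 2 for lower, upper in lon_bounds]
    return lats, lons, lat_bounds, lon_bounds


lats, lons, lat_bounds, lon_bounds = regular_grid(2.0)
```

Cell bounds are included because conservative remapping needs the cell areas, not only the cell centres.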


xESMF

xESMF - ESMF/ESMPy regridding capabilities applied on xarray.Datasets

Remapping in two steps (common to all Regridders):

  • Create regridding weights with ESMF/ESMPy (→ which source grid cells contribute how much to which target grid cells):

    n x m source grid, p x o target grid → (p*o) x (n*m) weight matrix

Weight matrices are quite sparse (a target cell value depends only on few source grid cells) and therefore can be efficiently stored by neglecting 0-values.

  • Regridding: Apply the weights on the source data by matrix multiplication.
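
The two steps can be sketched without any library. Below, the weights are stored as (target index, source index, weight) triplets, so zero entries never appear, and regridding is a sparse matrix-vector product over the flattened source field. This is a conceptual sketch only; the triplet layout and the name `apply_weights` are assumptions, and the weights are written by hand instead of being generated by ESMF.

```python
def apply_weights(weights, source, n_target):
    """Multiply a sparse weight matrix with the flattened source field.

    weights: list of (target_index, source_index, weight) triplets,
             i.e. only the non-zero entries of the weight matrix.
    source:  flattened source field of length n*m.
    """
    target = [0.0] * n_target
    for target_index, source_index, weight in weights:
        target[target_index] += weight * source[source_index]
    return target


# 2 x 2 source grid (flattened row by row) remapped to a 1 x 2 target grid.
# The dense weight matrix would be (1*2) x (2*2) = 2 x 4 with 8 entries,
# of which only 4 are non-zero.
source = [1.0, 3.0, 5.0, 7.0]
weights = [(0, 0, 0.5), (0, 2, 0.5), (1, 1, 0.5), (1, 3, 0.5)]

target = apply_weights(weights, source, n_target=2)
```

Because the weights depend only on the two grids, they can be computed once and reused for every time step and every variable on the same grid pair.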


Problems - Masking

Unmapped (out-of-source-domain) grid cells

Target grid cells with no contribution from a source grid cell will be masked.

This does not work for “nearest neighbour” remapping (by definition - there is always a nearest neighbour):

Here, such a mask would have to be generated manually (example):

  • Define a maximum distance for contributing cells
  • Infer mask from other remapping methods (eg. bilinear)
  • Set values at domain bounds to NaN
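
The first option, a maximum distance for contributing cells, could look like the sketch below. It is a brute-force illustration, not the clisops or xESMF implementation: the function names, the use of `None` as mask marker, and the cutoff of 100 km are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_with_cutoff(source_points, target_points, max_km):
    """Nearest neighbour remapping that masks targets far from any source.

    source_points: list of (lat, lon, value)
    target_points: list of (lat, lon)
    Returns one value per target point, or None where the nearest source
    point is further away than max_km.
    """
    result = []
    for tlat, tlon in target_points:
        distance, value = min(
            (haversine_km(tlat, tlon, slat, slon), v)
            for slat, slon, v in source_points
        )
        result.append(value if distance <= max_km else None)
    return result


source_points = [(0.0, 0.0, 1.0), (0.0, 1.0, 2.0)]
target_points = [(0.0, 0.4), (0.0, 0.9), (0.0, 10.0)]

remapped = nearest_with_cutoff(source_points, target_points, max_km=100.0)
```

The third target point lies roughly 1000 km from the nearest source point and is therefore masked, while plain nearest neighbour remapping would have assigned it a value.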


Problems - Masking

Adaptive masking

Target grid cells that lie partly outside of the source domain (or when at least one of the contributing source grid cells is masked) have to be (re-)normalized for many applications.

Adaptive masking allows the user to set a maximum contribution threshold of unmapped / masked source grid cells to the target grid cell.

Conservative remapping: at the bounds of the original domain, cells are partly unmapped, and the unmapped area “contributes” with a value of 0.

Conservative remapping with applied adaptive masking (re-normalization).
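
A minimal sketch of the re-normalization idea follows. It assumes that the weights of a fully covered target cell sum to 1, so that any shortfall is the masked or unmapped fraction. The function name `regrid_adaptive`, the triplet layout of the weights, and `None` as mask marker are assumptions of this example and not the xESMF or clisops interface.

```python
def regrid_adaptive(weights, source, n_target, threshold):
    """Apply weights, skip masked source cells, and re-normalize.

    weights:   (target_index, source_index, weight) triplets; the weights
               of a fully covered target cell are assumed to sum to 1.
    source:    flattened source field, None where masked.
    threshold: maximum fraction of a target cell that may be covered by
               masked or unmapped source area before it is masked.
    """
    valid_sum = [0.0] * n_target
    valid_weight = [0.0] * n_target
    for t, s, w in weights:
        if source[s] is not None:
            valid_sum[t] += w * source[s]
            valid_weight[t] += w

    result = []
    for total, weight in zip(valid_sum, valid_weight):
        missing_fraction = 1.0 - weight
        if weight == 0.0 or missing_fraction > threshold:
            result.append(None)            # too little valid input: mask
        else:
            result.append(total / weight)  # re-normalize by the valid weight
    return result


source = [2.0, 4.0, 6.0, None]  # last source cell is masked
weights = [
    (0, 0, 0.5), (0, 1, 0.5),    # target 0: fully valid
    (1, 2, 0.75), (1, 3, 0.25),  # target 1: 25 % masked
    (2, 2, 0.25), (2, 3, 0.75),  # target 2: 75 % masked
]

half = regrid_adaptive(weights, source, n_target=3, threshold=0.5)
```

With a threshold of 0.5, target cell 1 is kept and re-normalized to 6.0 (instead of 4.5 when the masked area contributes 0), while target cell 2 is masked.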


Problems - Masking

Adaptive masking - example: threshold 50 %

Target grid cells that lie partly outside of the source domain (or when at least one of the contributing source grid cells is masked) have to be (re-)normalized for many applications.

Adaptive masking allows the user to set a maximum contribution threshold of unmapped / masked source grid cells to the target grid cell (here: 50 %).

Figure: source domain / grid and target domain / grid; target grid cells are either masked or unmasked and renormalized.


Problems - Masking

Adaptive masking

Advantages:

  • manually specified masks during weight generation are not needed → reuse of weights

  • works for all regridding methods (except nearest neighbour, where renormalization is not applicable)

  • the user can set a threshold value for the maximum fraction of a target grid cell that masked source grid cells may overlap

adaptive masking - threshold 1

adaptive masking - threshold 0.8


Problems - Halos

Grid halos

  • Duplicates of the first/last column(s), row(s) on the opposite side of the grid

  • Masking or renormalisation required for conservative remapping

  • But: the grid descriptions of the halos are often broken (“degenerated cells”) and the halos are not exact copies

  • This and complex grid structures make automated masking / removal impossible

Scatter plot of the latitude and longitude coordinates.


Problems - Unstructured Grids

Unstructured Grids

Example AWI FESOM

  • 830305 grid cells of various shapes and sizes
  • cells have up to 16 vertices
  • unmapped over land
  • but: no NaN values around the unmapped land, as for ICON grids

Scatter plot of the latitude and longitude coordinates.

AWI FESOM data remapped with nearest neighbour method.


Problems

Further problems

  • Data that would not have been published with proper QC: Fixes and workarounds necessary

  • Each vector component has to be remapped individually - problematic for grids where the vector components do not represent geographic N-S and E-W

  • Staggered grids vs collocated grids:

On a staggered grid the scalar variables (pressure, density, total enthalpy etc.) are stored in the cell centers of the control volumes, whereas the velocity or momentum variables are located at the cell faces. This allows easy calculation of derivatives of eg. wind velocity at the cell centers.

Staggered grid (Arakawa C-Grid), Delandmeter and van Sebille (2019)
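
To illustrate the vector component problem: if the components are given along the local grid directions, they would first have to be rotated to geographic east and north before each component is remapped on its own. The sketch below shows such a rotation. The function name and the angle convention (counter-clockwise angle from geographic east to the local grid x-direction) are assumptions of this example; models define and provide such angles in different ways, if at all.

```python
import math


def rotate_to_geographic(u_grid, v_grid, angle_deg):
    """Rotate grid-relative vector components to eastward / northward.

    angle_deg: counter-clockwise angle from geographic east to the local
               grid x-direction (convention assumed for this sketch).
    """
    a = math.radians(angle_deg)
    u_east = u_grid * math.cos(a) - v_grid * math.sin(a)
    v_north = u_grid * math.sin(a) + v_grid * math.cos(a)
    return u_east, v_north


# A flow along the grid x-direction on a cell whose x-axis points north
u_east, v_north = rotate_to_geographic(1.0, 0.0, 90.0)
```

In this example the grid-relative "u" component is entirely northward flow, so remapping it as if it were an eastward component would give wrong results.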


Outlook

  • Implement nearest neighbour “unmapped” cells masking within clisops or xESMF (Application 1, Application 2)

  • Support of unstructured grids (“meshes”) in xESMF

  • Central database of regridding weights with web interface, synchronization with local weights cache

  • Potentially support further tools as regridding backend


Material

Today’s TGIF GitHub Repository

https://github.com/sol1105/tgif_copernicus_clisops_21-11-12

Jupyter Notebook (NBViewer link)

clisops RtD - Regridding section

Feel free to take a look at the provided notebooks or use the Binder link to try them out yourself!


Copernicus Climate Data Store

Beta - available 26.09.2024

https://cds-beta.climate.copernicus.eu/


Copernicus Climate Data Store - Beta

EarthKit by ECMWF

Python API


Copernicus Climate Data Store - Beta

STAC Catalog

Python tool to download data

Datasets download


Copernicus Climate Data Store - Beta

  • Portal to download datasets
  • Python API to download datasets

  • Removed Toolbox Editor

  • STAC Catalog - but collection level only (like CMIP6, CMIP5, …)
  • Improved backend
  • EarthKit Python library - operators like subsetting, regridding

Planned to support Notebooks to access Datasets.

But … search API with STAC for Datasets is missing.


… thanks for your attention!

If you made it this far …


Copernicus quality checks

for CMIP5, CMIP6 and CORDEX


Copernicus Quality Checks - Workflow

Quality Checks

ESGF

CDS

WPS


Copernicus Quality Checks

  • Performed by CEDA and IPSL
  • QCed subset of CMIP5, CMIP6, CORDEX
  • checks on metadata:
    • vocabulary
    • filenames
    • dimensions
  • checks on the data:
    • only zeros
    • outliers
  • Tools:
    • PrePARE
    • QA-DKRZ
    • nctime
    • cf-checker
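
The two data checks can be sketched as follows. This is not how the listed tools work internally; the function names, the outlier criterion (distance from the median in units of the median absolute deviation) and the factor of 5 are assumptions chosen for this illustration.

```python
import statistics


def only_zeros(values):
    """True if every value of the field is exactly zero."""
    return all(v == 0 for v in values)


def find_outliers(values, n_mad=5.0):
    """Indices of values far from the median.

    A value counts as an outlier if it deviates from the median by more
    than n_mad times the median absolute deviation (assumed criterion).
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values) if abs(v - median) > n_mad * mad]


# Near-surface temperatures in K with one implausible value
temperatures = [280.1, 281.0, 279.5, 280.7, 1e20, 280.3]

zeros_flag = only_zeros(temperatures)
outlier_indices = find_outliers(temperatures)
```

A median-based criterion is used here because a single extreme value would distort a mean and standard deviation so strongly that it might hide itself.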
