1 of 61

June 6, 2025

Inside the CESM Factory:

Model Development and Tuning

Cécile Hannay and Brian Medeiros (CGD/NCAR)

This material is based upon work supported by the National Center for Atmospheric Research, which is a major facility sponsored by the National Science Foundation under Cooperative Agreement No. 1852977.

2 of 61

Outline

Timeline of building CESM2

The art of tuning

Model development tales

Tuning in the ML era (Brian Medeiros)

3 of 61

Timeline of building CESM2

4 of 61

CESM2: Development of the individual components

  • Atmosphere: CAM6
  • Land Ice: CISM2
  • Land: CLM5
  • Sea-ice: CICE5
  • Ocean: POP2

Phase 1: “Let’s build the model components” (5 years)

  • For CESM2: the effort began around 2010
  • Individual components were built within each working group

5 of 61

CESM2: Development of the individual components

Phase 1: “Let’s build the model components” (5 years)

During the building phase, working groups focus on the aspects of their model they want to improve.

Atmosphere (CAM): dynamical core, resolution, physical parameterizations

Many uncoupled simulations + analysis

6 of 61

CESM2: Coupling of the individual components

Phase 2: “Let’s put it together” (3 years)

  • Collaborative effort started in Nov 2015
  • Many meetings with “everybody” (all working group co-chairs/liaisons)
  • 300 configurations
  • Thousands of simulated years and diagnostics

CESM2 Release: June 2018

7 of 61

Building CESM2 Timeline

2010 → 2018

Along the way: Model Tuning

8 of 61

The Art of Tuning

9 of 61

Model tuning

Tuning = adjusting parameters (“tuning knobs”) to achieve best agreement with observations.

Tuning knobs = parameters weakly constrained by observations

Dcs = threshold diameter to convert cloud ice particles to snow

Cirrus clouds

  • clouds made up of ice crystals (cloud ice)
  • altitudes higher than 5 km

Big ice crystals fall out of the cloud => cloud ice “converts” to snow. Dcs is the threshold diameter.

16 of 61

Model tuning

Dcs = threshold diameter to convert cloud ice particles to snow

  • Smaller Dcs => less cloud ice
  • Larger Dcs => more cloud ice

What is the impact on climate? More cloud ice => less infrared radiation (IR) goes to space.

18 of 61

Model tuning

Tuning = adjusting parameters (“tuning knobs”) to achieve best agreement with observations.

Top of atmosphere radiative balance should be near zero

Adjust Dcs
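The adjustment itself is a one-dimensional root-finding problem: find the Dcs at which the top-of-atmosphere (TOA) imbalance crosses zero. A minimal sketch, assuming a made-up linear response; in reality each evaluation of the response is a multi-year CAM simulation, and the names and numbers here are purely illustrative:

```python
def toa_imbalance(dcs_microns):
    """Hypothetical TOA net imbalance (W/m^2) as a function of Dcs.
    Larger Dcs -> less cloud ice -> more outgoing IR -> smaller imbalance."""
    return 2.0 - 0.01 * (dcs_microns - 200.0)

def tune_dcs(lo=100.0, hi=600.0, tol=0.05):
    """Bisect on Dcs until the imbalance is within tol W/m^2 of zero."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        f = toa_imbalance(mid)
        if abs(f) < tol:
            return mid
        if f > 0:          # imbalance still positive: Dcs too small
            lo = mid
        else:              # imbalance negative: Dcs overshot
            hi = mid
    return 0.5 * (lo + hi)

dcs = tune_dcs()
```

Hand tuning follows the same loop, except that each "function evaluation" costs months of simulation, so modelers rely on experience rather than a formal root finder.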

19 of 61

Model tuning

Why is it so important to tune the atmosphere’s radiative balance?

[Figures: radiative balance (W/m²) vs. years; ocean temperature (°C) vs. years]

If the atmosphere radiative balance is positive, the ocean is warming

20 of 61

Model tuning

Other targets when tuning

  • Cloud forcing
  • Precipitation
  • ENSO amplitude
  • Atlantic Meridional Overturning Circulation (AMOC)
  • Sea-ice thickness/extent

Top of atmosphere radiative balance should be near zero

21 of 61

Dilemmas while tuning

  • Subjectivity of tuning targets

Tuning involves choices and compromises. Overall, tuning has a limited effect on model skill.

  • Tuning for pre-industrial ⬄ Tuning for present day

Pre-industrial: radiative equilibrium. Present day: available observations.

  • Tuning individual components ⬄ Tuning coupled model

Tuning individual components is fast, but there is no guarantee that results transfer to the coupled model.

  • The tuning exercise is very instructive

We learn a lot about the model during the tuning phase.

22 of 61

Model development tales

23 of 61

Coupling = Unleashing the Beast

AMIP run

  • Prescribed SSTs
  • No drift

Coupled run

  • Fully active ocean
  • Coupled bias and feedback

SSTs = Sea Surface Temperatures. AMIP = a type of run in which SSTs are prescribed.

24 of 61

Example of unleashing the beast (1)

Tuning CAM5 (CESM1 development, 2009)

  • Tuning was done in AMIP mode: it looks like a “perfect” simulation

Evolution of the SST errors (K)

Mean SST errors (K)

  • In coupled mode: strong cooling of the North Pacific (bias > 5K)

Courtesy Rich Neale

CAM = Community Atmosphere Model. SST = Sea Surface Temperature. AMIP = a type of run in which SSTs are prescribed.

25 of 61

Example of unleashing the beast (1)

Feedback loop in coupled mode: more cloud in the North Pacific => colder SSTs => sea-ice grows.

Courtesy Rich Neale

26 of 61

Example of unleashing the beast (2)

Spectral Element dycore development (CESM1.2, 2013)

  • In CAM standalone: Finite Volume (FV) and Spectral Element (SE) dycores produce very similar simulations.
  • In coupled mode: SSTs stabilize, but 0.5K colder with the SE dycore than with FV, and too cold compared to obs (bias = -0.38K, RMSE = 0.96).

[Figures: SSTs (K) vs. years and global ocean temperature (°C) for FV and SE, showing the 0.5K offset]

29 of 61

What is different (Finite Volume ⬄ Spectral Element)?

Tuning parameters:

  Parameter  | FV     | SE
  -----------|--------|------
  rhminl     | 0.8925 | 0.884
  rpen       | 10     | 5
  dust_emis  | 0.35   | 0.55

Surface stress: slight difference in the Southern Ocean; the maximum of the zonal surface stress (N/m²) moves north in SE.

Grid differences at high latitudes (red: SE grid, blue: FV grid, at about 2 degrees). Courtesy: Peter Lauritzen

Topography: new software to generate topography (accommodates unstructured grids and enforces more physical consistency).

Remapping differences (ocn ⬄ atm): for state variables, FV uses “bilinear” and SE “native” mapping.

31 of 61

Mechanism responsible for SST cooling in SE

  • Wind stress curl anomaly (years 1-10): wind stress curl difference at 50S
  • 100-m vertical velocity anomaly: upwelling of cold-water anomaly at 50S
  • SST anomaly from CORE: cold SSTs are advected north by the ocean circulation

The change in the location of upwelling zones associated with the ocean circulation is responsible for the SST cooling.

32 of 61

Similar behavior in the GFDL model

  • CM2.0 (Eulerian dycore): SST cooling, warm layer at 750m
  • CM2.1 (FV dycore): reduced biases

[Figure: zonal wind stress, FV vs. Eulerian, south to north]

33 of 61

Example of unleashing the beast (3)

The Labrador Sea issue (CESM2 development, 2016)

  • The Labrador Sea was freezing in CESM2_dev.

CESM1: sea-ice extent is close to obs (observed sea-ice extent shown as a black line); the Labrador Sea is ice free.

CESM2_dev: the Labrador Sea is ice-covered.

34 of 61

Example of unleashing the beast (3)

The Labrador Sea issue (CESM2 development, 2016)

  • Why was the Labrador Sea freezing?

SST and salinity biases: CESM1 is too warm and salty; CESM2_dev is too cold and too fresh. Too cold and too fresh south of Greenland => the Labrador Sea freezes.


36 of 61

Trouble in the Labrador Sea

CESM1: sea-ice extent is close to obs; the Labrador Sea is ice free (also true for LENS).

CESM2_dev: extensive sea-ice cover; the Labrador Sea is ice covered.

Timeseries of sea-ice thickness in the Labrador Sea: sea ice builds up. This can happen after 1 yr, 40 yr, 100+ yr. It was a very robust feature in CESM2_dev.

Multiple attempts to fix the issue, including the Estuary Box Model (EBM).

37 of 61

Estuary Box Model (EBM) to the rescue!

[Figures: EBM – CONTROL (coupled) differences in sea surface salinity and sea surface temperature]

Courtesy: Gokhan Danabasoglu

38 of 61

Coupling = Unleashing the Beast

39 of 61

Summary

Building CESM happens in two phases:

  • Phase 1 (starting ~2010): Let’s build the components
  • Phase 2 (through 2018): Let’s couple the components

40 of 61

Summary

The Art of Tuning

Tuning = adjusting parameters (“tuning knobs”) to achieve best agreement with observations.

    • Tuning involves choices and compromises
    • We learn a lot about the model while tuning

The Art of Coupling

Three examples of coupling challenges:

    • CESM1: cold SST bias in the North Pacific with CAM5
    • CESM1.2: SSTs stabilize 0.5K colder with the SE dycore
    • CESM2: the Labrador Sea is ice-covered

41 of 61

Thanks !

Cartoons by non-artificial intelligence: Kolya Dols and Vincent Dols

42 of 61

Calibrating ESMs in the ML era

Brian Medeiros & Linnia Hawkins

43 of 61

Challenges of “hand tuning”

  • Large: many parameters
  • Slow: change a parameter, run the model, analyze results, update parameter value(s), repeat
  • Complex: difficult to predict/understand/identify parameter interactions
  • Ambiguous: difficult to define objective “tuning targets” and quantify trade-offs between them

Totally fake picture

44 of 61

Exploring Parameter Sensitivities

  • Explore model sensitivity through parameter adjustments.
  • Generate an ensemble with varied parameter values.
  • Analyze model outputs across the parameter set.
  • Identify parameters influencing model behavior.
  • Understand uncertainties within Earth system models.

Totally fake picture
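A common way to generate such an ensemble is Latin hypercube sampling, which stratifies each parameter's range so that even a modest ensemble covers it evenly. A sketch under illustrative assumptions (the ranges below are made up, not tuned values):

```python
import random

def latin_hypercube(n_members, param_ranges, seed=0):
    """Draw n_members parameter sets. Each parameter's range is split into
    n_members strata; each stratum is sampled exactly once, then the strata
    are shuffled across members, guaranteeing even marginal coverage."""
    rng = random.Random(seed)
    names = list(param_ranges)
    columns = {}
    for name in names:
        lo, hi = param_ranges[name]
        # one uniform draw per stratum
        strata = [lo + (hi - lo) * (i + rng.random()) / n_members
                  for i in range(n_members)]
        rng.shuffle(strata)
        columns[name] = strata
    return [{name: columns[name][i] for name in names}
            for i in range(n_members)]

# Hypothetical ranges for two of the CAM6 PPE parameters
ranges = {"micro_mg_dcs": (100e-6, 600e-6), "clubb_c1": (0.5, 2.0)}
ensemble = latin_hypercube(10, ranges)
```

Plain random sampling can leave large gaps in a 40-plus-dimensional parameter space; stratifying each dimension at least guarantees that every parameter's range is explored.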

45 of 61

CAM6 Perturbed Parameter Ensemble

Cloud microphysics (micro_mg_*): micro_mg_max_nicons, micro_mg_effi_factor, micro_mg_vtrmi_factor, micro_mg_iaccr_factor, micro_mg_berg_eff_factor, micro_mg_accre_enhan_fact, micro_mg_autocon_fact, micro_mg_autocon_nd_exp, micro_mg_autocon_lwp_exp, micro_mg_homog_size, micro_mg_dcs

Turbulence and macrophysics (clubb_*): clubb_c1, clubb_C8, clubb_c11, clubb_c14, clubb_c_K10, clubb_gamma_coef, clubb_C2rt, clubb_wpxp_L_thresh, clubb_beta, clubb_C6rt, clubb_C6rtb

Deep convection and cloud fraction (zmconv_*, cldfrc_*): zmconv_c0_lnd, zmconv_c0_ocn, zmconv_ke, zmconv_ke_lnd, zmconv_dmpdz, cldfrc_dp1, cldfrc_dp2, zmconv_num_cin, zmconv_momcu, zmconv_momcd, zmconv_tiedke_add, zmconv_capelmt

Aerosols and activation: seasalt_emis_scale, dust_emis_fact, sol_factb_interstitial, sol_factic_interstitial, microp_aero_wsub_min, microp_aero_wsubi_min, microp_aero_wsub_scale, microp_aero_wsubi_scale, microp_aero_npccn_scale

46 of 61

CLM Perturbed Parameter Ensembles

47 of 61

Calibrating ESMs in the ML era

Emulators are fast surrogates for climate models, mapping parameters to outputs (usually statistics of model variables).

Emulators can be used in any traditional calibration method (e.g., Markov chain Monte Carlo [MCMC], approximate Bayesian computation [ABC]).

Emulators are also differentiable, so calibration can exploit their gradients rather than relying on fully numerical, derivative-free search.
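As a toy illustration of gradient-based calibration, the sketch below hand-codes a linear "emulator" and its gradient, and fits two parameters to two observed targets by gradient descent. All names and numbers are invented; a real application would use a trained neural-network emulator with automatic differentiation:

```python
# Toy calibration against a differentiable emulator.
OBS = [1.0, -0.5]   # made-up observed targets

def emulator(theta):
    """Map parameters -> predicted observables (toy linear map)."""
    a, b = theta
    return [a + b, a - b]

def loss_and_grad(theta):
    """Sum-of-squares misfit and its analytic gradient."""
    pred = emulator(theta)
    resid = [p - o for p, o in zip(pred, OBS)]
    loss = sum(r * r for r in resid)
    # chain rule through the linear emulator (d pred / d theta is constant)
    grad = [2 * (resid[0] + resid[1]), 2 * (resid[0] - resid[1])]
    return loss, grad

def calibrate(theta, lr=0.1, steps=200):
    for _ in range(steps):
        _, grad = loss_and_grad(theta)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

theta = calibrate([0.0, 0.0])   # recovers a = 0.25, b = 0.75 in this toy
```

The point is the workflow, not the arithmetic: once the parameter-to-output map is cheap and differentiable, each calibration step costs microseconds instead of a model run.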

48 of 61

Surrogate-based calibration

49 of 61

ESM calibration methods

  • NN-derived emulator-based MCMC (Elsaesser et al., 2025)

  • Calibrate-Emulate-Sample (Cleary et al., 2021): a Gaussian-process emulator built on ensemble Kalman sampling. “The overarching approach is to first use ensemble Kalman sampling (EKS) to calibrate the unknown parameters to fit the data; second, to use the output of the EKS to emulate the parameter-to-data map; third, to sample from an approximate Bayesian posterior distribution in which the parameter-to-data map is replaced by its emulator.”

  • History matching (Williamson et al., 2013; Hourdin et al., 2021), closely related to ABC
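The final "sample" step of such schemes can be as simple as a random-walk Metropolis chain in which the expensive model is replaced by the emulator. A sketch with a made-up one-parameter emulator, observation, and prior (none of these values come from a real calibration):

```python
import math, random

def emulator(theta):
    """Stand-in for the parameter-to-data map."""
    return theta ** 2

OBS, OBS_VAR = 4.0, 0.25   # hypothetical observation and error variance

def log_posterior(theta):
    if not (0.0 <= theta <= 5.0):   # uniform prior on [0, 5]
        return -math.inf
    r = emulator(theta) - OBS
    return -0.5 * r * r / OBS_VAR

def metropolis(n_samples, step=0.3, seed=1):
    """Random-walk Metropolis over the emulated posterior."""
    rng = random.Random(seed)
    theta, lp = 2.5, log_posterior(2.5)
    chain = []
    for _ in range(n_samples):
        prop = theta + rng.gauss(0.0, step)
        lp_prop = log_posterior(prop)
        if math.log(rng.random()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        chain.append(theta)
    return chain

chain = metropolis(5000)   # concentrates near theta = 2 for this toy setup
```

Running millions of such proposals is only feasible because each posterior evaluation calls the emulator, not the climate model.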

50 of 61

CLM simulates far more processes than are observed


52 of 61

Carbon cycle uncertainty in land model projections

32 parameters in the Community Land Model

500 member perturbed parameter ensemble

Unconstrained parametric uncertainty: the projected land carbon flux ranges from source to sink.

[Figure: CMIP5 RCP8.5 range from Friedlingstein et al. (2014)]

53 of 61

Constraining parametric uncertainty

  1. Generate a perturbed parameter ensemble (56 parameters, 1500 ensemble members)
  2. Train an emulator
  3. Emulate a dense sample
  4. Rule out implausible parameter sets
  5. Constrain parameter posteriors


56 of 61

History Matching

  • Wave 0: Latin hypercube sample
  • Wave 1: Leaf area
  • Wave 2: Biomass, Gross primary productivity, Leaf area
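Each wave applies the same rule: discard any candidate whose emulated output is implausibly far from the observation, given observational and emulator error. A sketch with toy numbers (the real waves use CLM emulators and leaf-area, biomass, and GPP observations):

```python
def implausibility(emulated, obs, obs_var, emu_var):
    """Distance from the observation in units of combined std. deviation."""
    return abs(emulated - obs) / (obs_var + emu_var) ** 0.5

def keep_plausible(candidates, emulator, obs, obs_var, emu_var, cutoff=3.0):
    """Retain the not-ruled-out-yet (NROY) parameter values."""
    return [theta for theta in candidates
            if implausibility(emulator(theta), obs, obs_var, emu_var) <= cutoff]

# Toy wave: identity emulator, observation 1.2 with variance 0.01,
# emulator variance 0.0025, candidates spanning theta in [0, 2]
candidates = [i / 100.0 for i in range(201)]
nroy = keep_plausible(candidates, lambda t: t, 1.2, 0.01, 0.0025)
```

Subsequent waves rerun the model only inside the NROY region, retrain the emulator there, and cut again with the next observation, which is what makes the sequential constraint efficient.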

57 of 61

Constraining uncertainty in land carbon sink

Unconstrained parametric uncertainty

Leaf area constraint

Biomass, GPP, leaf area constraints

58 of 61

Calibrating CLM6

  • Robust emulators trained on a PPE can be used for calibration.

[Maps: leaf area bias, default (mean absolute error = 1.36) vs. calibrated (mean absolute error = 0.60)]

59 of 61

Unexpected improvements & Structural Error

60 of 61

Challenges with ML calibration

Methods

  • Number of parameters
  • Sampling strategies
  • Emulation architecture (lots of choices; lots of penalty functions)

Computation

  • Still need to run the physical model for training
  • Length of simulations?
  • Does forcing matter?

Culture

  • Difficult to build confidence in the new methods
  • Adjusting to having many plausible parameter settings
  • Altering analysis approaches

Coupling

  • Activates feedbacks differently, adds complexity/ambiguity

61 of 61

ESM Calibration Evolved

  • Emulators offer fast surrogates for complex climate models.
  • Calibration methods benefit from differentiable emulators and gradients.
  • Uncertainty is constrained using perturbed parameter ensembles and emulators.
  • History matching sequentially rules out implausible parameter sets.
  • Observed variables constrain parametric uncertainty in unobserved ones.