1 of 25

Carbon Neutrality, Sustainability, and HPC

Daniel Reed, University of Utah (moderator)

Robert Bunger, Schneider Electric

Andrew Chien, University of Chicago

Nicolas Dubé, Hewlett Packard Enterprise

Esa Heiskanen, Finnish IT Center for Science

Genna Waldvogel, Los Alamos National Laboratory

2 of 25

How did we get here …

  • HPC systems continue growing to deliver performance
    • The “free lunch” of Moore’s Law is over
  • Thermal Design Power (TDP) keeps growing
    • The speed of light is slow and chiplet packages are dense
  • Power Usage Effectiveness (PUE) became really important
    • Cooling efficiency is a very real cost at scale
  • Water is even more precious and site constrained

  • Total energy and cooling requirements keep rising
    • Energy is now a substantial OPEX cost
    • The “AI Boom” is now a rising energy consumer
    • The environmental issues are very real

1/15/2024

Based on data from M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten

3 of 25

… and what do we do?

  • What R&D is needed to shift the dynamics around computing sustainability?
  • What are our biggest constraints – economics, policy, or technology?
  • How do we measure sustainability?  (By analogy with total cost of ownership, where do we draw the bounding box and what do we count?)
    • Design, manufacturing, assembly, operation, disassembly, recycling, …
  • How do we reconcile performance desires with sustainability?
  • How do we balance CAPEX, OPEX, and timescales?
  • How do we quantify (via metrics) and manage our carbon footprint?
    • Water, greenhouse gases, energy, waste, …

4 of 25

Environmental Sustainability Metrics for Data Centers

Robert Bunger

Schneider Electric

5 of 25

The five key areas of impact

  • Energy: Data centers consume 1-2% of global energy
  • GHG emissions: Scope 1, 2 and 3 emissions have direct impact on climate change
  • Water: Data center cooling systems and power plants use significant amounts of water
  • Waste: Waste is generated during construction and operations
  • Local ecosystem: Data center facilities and upstream value chain have impact on the ecosystem

6 of 25

The journey to holistic environmental sustainability

  • Beginning: Starting the journey (6 of 28 metrics)
  • Advanced: Deliver significant individual impact (18 of 28 metrics)
  • Leading: Reshape industry toward net-zero (28 of 28 metrics)

7 of 25

Metrics to measure energy

Columns: Metric category, Key metric, Units, Recommendations (Beginning, Advanced, Leading)

  • Energy
    • Total energy consumption: kWh
    • Power usage effectiveness (PUE): ratio
    • Total renewable energy consumption: kWh
    • Renewable energy factor (REF): ratio
    • Energy reuse factor (ERF): ratio
    • Server utilization (ITEU): %
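The three ratio metrics on this slide can be sketched in a few lines. The definitions below follow the common formulations (PUE as total facility energy over IT equipment energy, REF and ERF as renewable and reused energy over total energy); the function name and the annual figures are hypothetical and are not taken from the slides.

```python
def energy_metrics(total_kwh, it_kwh, renewable_kwh, reused_kwh):
    """Return PUE, REF, and ERF for one reporting period.

    Assumed definitions (not stated on the slide):
      PUE = total facility energy / IT equipment energy
      REF = renewable energy consumed / total facility energy
      ERF = energy reused outside the data center / total facility energy
    """
    if total_kwh <= 0 or it_kwh <= 0:
        raise ValueError("energy totals must be positive")
    return {
        "PUE": total_kwh / it_kwh,
        "REF": renewable_kwh / total_kwh,
        "ERF": reused_kwh / total_kwh,
    }


# Hypothetical annual figures for a mid-sized facility.
metrics = energy_metrics(
    total_kwh=12_000_000,
    it_kwh=10_000_000,
    renewable_kwh=9_000_000,
    reused_kwh=1_200_000,
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up inputs the facility would report a PUE of 1.2, with three quarters of its energy renewable and a tenth of it reused.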

8 of 25

Metrics to measure greenhouse gas emissions

Columns: Metric category, Key metric, Units, Recommendations (Beginning, Advanced, Leading)

  • GHG emissions
    • Scope 1
      • GHG emissions: mtCO2e
    • Scope 2
      • Location-based GHG emissions: mtCO2e
      • Market-based GHG emissions: mtCO2e
    • Scope 3
      • GHG emissions: mtCO2e
    • Carbon usage effectiveness (CUE): mtCO2e/kWh
    • Total carbon offsets: mtCO2e
    • Hourly renewable supply and consumption matching: %
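Two of these metrics are simple enough to sketch. The code below assumes CUE is emissions divided by IT equipment energy, and that hourly matching is the share of consumption covered by renewable supply in the same hour (surplus in one hour cannot cover a deficit in another). Both definitions, the function names, and the sample data are assumptions for illustration.

```python
def carbon_usage_effectiveness(emissions_mtco2e, it_kwh):
    """CUE in mtCO2e per kWh of IT equipment energy (assumed definition)."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return emissions_mtco2e / it_kwh


def hourly_matching_pct(consumption_kwh, renewable_kwh):
    """Percent of consumption matched by renewable supply, hour by hour."""
    if len(consumption_kwh) != len(renewable_kwh):
        raise ValueError("series must cover the same hours")
    # Only the overlap within each hour counts as matched.
    matched = sum(min(c, r) for c, r in zip(consumption_kwh, renewable_kwh))
    return 100.0 * matched / sum(consumption_kwh)


# Hypothetical four-hour window: steady load, variable renewable supply.
load = [100, 100, 100, 100]
supply = [150, 50, 0, 100]

hourly_pct = hourly_matching_pct(load, supply)
volume_pct = 100.0 * min(sum(supply), sum(load)) / sum(load)
cue = carbon_usage_effectiveness(emissions_mtco2e=4_000, it_kwh=10_000_000)

print(f"hourly matching: {hourly_pct:.1f}%")
print(f"volume matching: {volume_pct:.1f}%")
print(f"CUE: {cue:.6f} mtCO2e/kWh")
```

The gap between the hourly figure and the simple volume figure is the reason hourly matching is listed as a separate, stricter metric.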

9 of 25

Metrics to measure water usage

Columns: Metric category, Key metric, Units, Recommendations (Beginning, Advanced, Leading)

  • Water
    • Total site water usage: m³
    • Total source energy water usage: m³
    • Water usage effectiveness (WUE): m³/kWh
    • Water replenishment: m³
    • Total water use in supply chain: m³
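A minimal sketch of the WUE calculation, assuming the usual formulation of site water divided by IT equipment energy, plus a "source" variant that also counts water used to generate the electricity. The function name and all input values are hypothetical.

```python
def water_usage_effectiveness(site_m3, it_kwh, source_m3=0.0):
    """Return (site WUE, source WUE) in m3 per kWh of IT energy.

    Assumed definitions:
      site WUE   = site water usage / IT equipment energy
      source WUE = (site + source energy water usage) / IT equipment energy
    """
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return site_m3 / it_kwh, (site_m3 + source_m3) / it_kwh


# Hypothetical annual figures.
wue_site, wue_source = water_usage_effectiveness(
    site_m3=18_000, it_kwh=10_000_000, source_m3=40_000
)

# 1 m3 = 1,000 litres, so multiply by 1,000 to report litres per kWh.
print(f"site WUE:   {wue_site * 1000:.2f} L/kWh")
print(f"source WUE: {wue_source * 1000:.2f} L/kWh")
```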

10 of 25

Metrics to measure waste

Columns: Metric category, Key metric, Units, Recommendations (Beginning, Advanced, Leading)

  • Waste
    • Waste generated
      • Total waste: metric ton
      • E-waste: metric ton
      • Battery: metric ton
    • Waste diversion rate
      • Total waste: ratio
      • E-waste: ratio
      • Battery: ratio
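As a rough illustration, the diversion rate for each waste stream can be computed as the mass diverted from landfill (reused, recycled, or recovered) divided by the mass generated. That definition, the function name, and the tonnages below are assumptions, not figures from the slides.

```python
def diversion_rates(generated_t, diverted_t):
    """Per-stream waste diversion rate: diverted mass / generated mass."""
    rates = {}
    for stream, generated in generated_t.items():
        diverted = diverted_t.get(stream, 0.0)
        if generated <= 0 or diverted > generated:
            raise ValueError(f"inconsistent figures for {stream!r}")
        rates[stream] = diverted / generated
    return rates


# Hypothetical annual tonnages (metric tons).
generated = {"total": 120.0, "e-waste": 20.0, "battery": 5.0}
diverted = {"total": 90.0, "e-waste": 19.0, "battery": 5.0}

rates = diversion_rates(generated, diverted)
for stream, rate in rates.items():
    print(f"{stream}: {rate:.0%} diverted")
```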

11 of 25

Metrics to measure local ecosystem

Columns: Metric category, Key metric, Units, Recommendations (Beginning, Advanced, Leading)

  • Local ecosystem
    • Land
      • Total land use: m²
      • Land-use intensity: kW/m²
    • Outdoor noise: dB(A)
    • Mean species abundance (MSA): MSA/km²

12 of 25

Opportunities in Computing on Variable Capacity Resources

Andrew A Chien

University of Chicago and Argonne National Laboratory

Adaptive Capacity Computing BoF at SC23

November 14, 2023

11/14/23

13 of 25

Large Scale Variation is Coming

  • It’s an Economic Imperative (affordable high-performance computing)

  • It’s a Climate Imperative (high-performance computing that does not destroy the environment)

14 of 25

Opportunities in Many Areas

  • Scheduling for Variable Capacity
  • Application-Job Adaptation for Variable Capacity
  • Application design for Variable Capacity
  • Intelligent HPC Datacenter Control for X
    • profit, sustainability, power grid, higher HPC capability!

[Figure labels: Applications, Jobs, Resources, Datacenter Control, Compute, Platforms]

15 of 25

Canonical Formulation of Variable Capacity

  • Dynamic Range
    • [Min, Max] Capacity
  • Variability Structure
    • Random Walk, Random Uniform
  • Change Frequency
    • Period of Capacity Change

Zhang and Chien, Scheduling Challenges for Variable Capacity Resources, JSSPP 2021.
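To make the three parameters concrete, here is a small sketch that generates a synthetic capacity trace from a dynamic range, a variability structure, and a change period. It is an illustration only: the function name, the fixed random-walk step size, and the clamping behavior are assumptions and may differ from the models used in the cited paper.

```python
import random


def capacity_trace(n_steps, min_cap, max_cap, period,
                   structure="random_walk", walk_step=0.1, seed=0):
    """Generate capacity values (fraction of peak) for n_steps time steps.

    min_cap, max_cap: dynamic range [Min, Max] of capacity
    structure:        "random_walk" or "random_uniform"
    period:           number of steps between capacity changes
    walk_step:        hypothetical walk step, as a fraction of the range
    """
    if structure not in ("random_walk", "random_uniform"):
        raise ValueError(f"unknown structure: {structure}")
    if not 0 <= min_cap <= max_cap or period < 1:
        raise ValueError("need 0 <= min_cap <= max_cap and period >= 1")

    rng = random.Random(seed)
    span = max_cap - min_cap
    cap = max_cap  # assumed starting point: full capacity
    trace = []
    for t in range(n_steps):
        if t > 0 and t % period == 0:  # capacity changes once per period
            if structure == "random_uniform":
                cap = rng.uniform(min_cap, max_cap)
            else:
                cap += rng.choice((-1.0, 1.0)) * walk_step * span
                cap = min(max_cap, max(min_cap, cap))  # stay in range
        trace.append(cap)
    return trace


# 48 hourly steps, capacity between 40% and 100%, changing every 6 hours.
walk = capacity_trace(48, 0.4, 1.0, period=6)
uniform = capacity_trace(48, 0.4, 1.0, period=6, structure="random_uniform")
print([round(c, 2) for c in walk[::6]])
print([round(c, 2) for c in uniform[::6]])
```

A scheduler under study could be replayed against traces like these while sweeping the range, structure, and period independently.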

16 of 25

Workshop on Scheduling for Variable Capacity Resources

  • 20 talks and panels with leading researchers
  • Anne Benoit, Andrew A. Chien, Yves Robert, Report from Workshop on “Scheduling Variable Capacity Resources for Sustainability”, March 2023, Paris, France.

17 of 25

Research Topics I

18 of 25

Research Topics II

19 of 25

Carbon Footprint of Generative AI

Nicolas Dubé, Ph.D.

HPE Senior Fellow and Chief Architect

Senior Vice President for HPC & AI Cloud Services

November 14th, 2023

20 of 25

Factors and megatrends: AI is now a supercomputing problem

Source: https://epochai.org

[Chart eras: Pre-deep learning (1960–2009); Deep learning (’10–’16); Foundation models (’16–)]

Chart annotations:

  • One week of iPhone 14 (2 AI TFLOP/s, 1 GPU)
  • One week of accelerated server (6 AI PFLOP/s, 8 GPUs)
  • One week of ORNL Frontier supercomputer (8 AI EXAFLOP/s, 40k GPUs)
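For a sense of scale, the three "one week" annotations can be converted into total operations. This sketch multiplies each quoted peak rate by the number of seconds in a week; it assumes full utilization at the quoted rate, which real training runs do not reach, and the dictionary keys are just labels.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604,800 s

# Peak AI throughput quoted on the slide, in FLOP/s.
peak_flops = {
    "iPhone 14": 2e12,                    # 2 AI TFLOP/s
    "Accelerated server (8 GPUs)": 6e15,  # 6 AI PFLOP/s
    "ORNL Frontier": 8e18,                # 8 AI EXAFLOP/s
}

# Upper bound on operations delivered in one week at peak rate.
week_flop = {name: rate * SECONDS_PER_WEEK for name, rate in peak_flops.items()}

for name, flop in week_flop.items():
    print(f"{name}: {flop:.2e} FLOP per week")

ratio = week_flop["ORNL Frontier"] / week_flop["iPhone 14"]
print(f"Frontier / iPhone 14: {ratio:.0e}x")
```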

21 of 25

“Training GPT-3, for example, which has 175 billion parameters, consumed 1,287 megawatt hours of electricity and generated 552 tons of carbon dioxide”

Patterson et al., Carbon Emissions and Large Neural Network Training

https://arxiv.org/ftp/arxiv/papers/2104/2104.10350.pdf
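The two figures in the quote imply an average carbon intensity for the electricity used, which is what connects this slide to grid mix. The sketch below derives that intensity and then re-estimates emissions for the same energy at other intensities. The alternative intensities are hypothetical placeholders, not measurements of any real grid, and "tons" is treated as metric tons.

```python
TRAINING_MWH = 1287   # energy quoted for GPT-3 training
TRAINING_TCO2 = 552   # emissions quoted for GPT-3 training

# Implied average intensity: tons -> grams, MWh -> kWh.
implied_g_per_kwh = (TRAINING_TCO2 * 1e6) / (TRAINING_MWH * 1e3)


def emissions_tons(energy_mwh, intensity_g_per_kwh):
    """Emissions in metric tons for a given energy and grid intensity."""
    return energy_mwh * 1e3 * intensity_g_per_kwh / 1e6


# Hypothetical grid intensities, for comparison only.
scenarios = {"low-carbon grid (assumed)": 20, "fossil-heavy grid (assumed)": 700}

print(f"implied intensity: {implied_g_per_kwh:.0f} gCO2/kWh")
for label, intensity in scenarios.items():
    tons = emissions_tons(TRAINING_MWH, intensity)
    print(f"{label}: {tons:.0f} tons")
```

The same training run spans more than an order of magnitude in emissions depending on where the electricity comes from.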

22 of 25

CO2 Emissions per kWh of electricity production

23 of 25

The Path to Sustainable Supercomputing

  • > 100 MW site in Canada
  • 100% renewable grid
  • Free cooling year round
  • “waste heat” re-capture

24 of 25

How many tomatoes can ChatGPT grow?

  • Typical 500 m² greenhouse in Quebec, Canada
    • 5,549 climate hours below 10 °C
    • ~1,000 gigajoules per heating season
  • Natural gas
    • Volume required: 0.0372 GJ/m³ => 26,881 m³
    • Combustion efficiency: 80% => 33,602 m³
    • Cost: $0.38/m³ => $12,768
    • 2 kg of CO2 / m³ => 67 tons of CO2 / year
  • Electricity
    • 0.0036 GJ/kWh => 277,778 kWh
    • Efficiency: 100% => 277,778 kWh
    • Cost: $0.05/kWh => $13,888
    • 277,778 kWh / 5,549 h = 50 kW
  • 10 MW datacenter @ 50% heat re-cycle efficiency = 100 greenhouses and 6,720 tons of CO2 offset

  • ChatGPT training of 1,287 MWh
    • = 4,633 gigajoules or 4.6 greenhouses (for 1 year)
  • Tomato production math
    • 75 kg per m², per year ( !!! )
    • 85% of greenhouse area for production
    • 4.6 greenhouses * 500 m² * 85% = 1,969 production m²
    • 1,969 * 75 kg = 147,677 kg per year
    • => 1,033,738 tomatoes !!!
  • Natural gas combustion carbon offset (in addition to the 552 tons!)
    • 0.0372 GJ/m³ => 38,402 m³ needed per greenhouse (70% combustion efficiency)
    • 2.2 kg of CO2eq / m³ => 85 tons of CO2eq per greenhouse
    • 4.6 greenhouses => 391 tons of CO2 (or 85 cars' annual carbon footprint)
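The arithmetic on this slide can be replayed step by step. The sketch below uses only the constants quoted on the slide (the variable names are mine) and carries full precision between steps, so a few results differ slightly from the slide's rounded intermediate values.

```python
GJ_PER_KWH = 0.0036            # energy content of 1 kWh
GJ_PER_M3_GAS = 0.0372         # energy content of 1 m3 of natural gas
HEAT_PER_GREENHOUSE_GJ = 1000  # heating season, 500 m2 greenhouse

# Training energy expressed as greenhouse-years of heat.
training_gj = 1287 * 1000 * GJ_PER_KWH
greenhouses = training_gj / HEAT_PER_GREENHOUSE_GJ

# Tomato production: 85% of 500 m2 per greenhouse, 75 kg per m2 per year.
production_m2 = greenhouses * 500 * 0.85
tomato_kg = production_m2 * 75

# Avoided natural gas: 70% combustion efficiency, 2.2 kg CO2eq per m3.
gas_m3_per_greenhouse = HEAT_PER_GREENHOUSE_GJ / GJ_PER_M3_GAS / 0.70
offset_tons = greenhouses * gas_m3_per_greenhouse * 2.2 / 1000

# Average electrical heating load for one greenhouse.
avg_load_kw = HEAT_PER_GREENHOUSE_GJ / GJ_PER_KWH / 5549

print(f"training energy: {training_gj:,.0f} GJ = {greenhouses:.1f} greenhouses")
print(f"production area: {production_m2:,.0f} m2")
print(f"tomatoes: {tomato_kg:,.0f} kg per year")
print(f"gas per greenhouse: {gas_m3_per_greenhouse:,.0f} m3")
print(f"CO2eq offset: {offset_tons:,.0f} tons")
print(f"average heating load: {avg_load_kw:.0f} kW")
```

The final tomato count on the slide additionally depends on an assumed mass per tomato, which the slide does not state, so it is not recomputed here.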

25 of 25

Thank you

nicdube@hpe.com

© 2024 Hewlett Packard Enterprise Development LP