CEEMS: A Resource Manager Agnostic Energy & Emissions Monitoring Stack
Mahendra Paipuri
Research Engineer
CNRS, France
11/17/2024
CNRS in numbers
IDRIS - National HPC Center of CNRS
Context
Compute Energy & Emissions Monitoring Stack (CEEMS)
IEA 2024
CEEMS
Control Groups (cgroups) provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour.
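As an illustration of how per-job accounting can be read from a cgroup, the following is a minimal sketch, assuming a cgroup v2 hierarchy in which each job's cgroup exposes a `cpu.stat` file of `key value` lines. The helper names and the sample values are hypothetical, and this is not CEEMS's own implementation.

```python
from typing import Dict

# Hypothetical sample of a cgroup v2 cpu.stat file (values are made up).
SAMPLE_CPU_STAT = """usage_usec 12500000
user_usec 10000000
system_usec 2500000
"""


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        stats[key] = int(value)
    return stats


def cpu_seconds(stats: Dict[str, int]) -> float:
    """Total CPU time consumed by the cgroup, in seconds."""
    return stats.get("usage_usec", 0) / 1_000_000


stats = parse_cpu_stat(SAMPLE_CPU_STAT)
print(cpu_seconds(stats))
```

On a real node the text would come from a file under `/sys/fs/cgroup/`, in the directory that the resource manager creates for the job; that directory layout depends on the resource manager and its configuration.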
CEEMS Architecture
Technical details
User’s Perspective
Job CPU Stats
User’s Perspective
Job CPU Perf Stats
User’s Perspective
Job GPU Stats
User’s Perspective
Job GPU Perf Stats
Operator’s Perspective
Profiling
CEEMS supports Grafana Alloy and Pyroscope
Continuous Profiling of SLURM jobs
Continuous Profiling of SLURM jobs
Final Remarks
GitHub Repository