1 of 36

AGC at US CMS Analysis Facilities

Carl Lundstedt, University of Nebraska, Lincoln

AGC Workshop, May 4, 2023

2 of 36

Analysis Facilities

3 of 36

Coffea-casa @ Nebraska

Oksana Shadura, John Thiltges, Garhan Attebury,

Carl Lundstedt, Ken Bloom, Sam Albin, Brian Bockelman

4 of 36

Casa Hardware – Flatiron

– 12 Dell R750 servers, 512 GB RAM, 10 × 3.2 TiB NVMe drives, Intel(R) Xeon(R) Gold 6348 CPU @ 2.60 GHz (56 threads/CPU, 2 CPUs per node)

– 2 x 100Gbps Networking, Calico + BGP

– Running Alma Linux 8.6 (Sky Tiger)

– Ceph-Rook Filesystem @ 183 TiB

– Single V100 Nvidia GPU

– Ceph + Skyhook @ 8.7 TiB Usable

– Kubernetes (v1.26.2)

– Cert-manager, Dex, External-dns, Sealed-secrets, Traefik, CVMFS

5 of 36

Casa Infrastructure & Management

– Configs for casa are kept in Git

– Changes follow GitOps practices

– Changes are applied in situ via a Flux agent

6 of 36

[Diagram: Coffea-Casa architecture. A user's browser connects to a JupyterHub shared between users; per-user resources include a Jupyter kernel, a Dask scheduler and Dask workers. Shared resources include Kubernetes resources, an HTCondor scheduler and HTCondor workers on the grid/cluster site, data delivery services (ServiceX), Skyhook, XCache and remote data access. Arrows distinguish data flow from non-data communication.]

7 of 36

Building blocks: Authentication Tools

JupyterHub allows for a variety of authentication methods, and we inherit this functionality.

Using OAuth we can select an OIDC service to manage users for us. Dummy authentication is also useful for spinning up test instances.

Each instance must be registered as an OIDC client, and the client secrets have to be available in the instance. We seal these secrets so they can be stored encrypted in Git.

https://cms-auth.web.cern.ch
https://cilogon.org/oauth2/register
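
As a rough illustration of how such an OIDC client is wired into JupyterHub, here is a minimal jupyterhub_config.py sketch using oauthenticator's GenericOAuthenticator. The endpoint paths, client ID/secret and callback URL are placeholders, not the actual coffea-casa configuration; in practice the secret would be injected from a sealed secret.

```python
# Minimal jupyterhub_config.py sketch (hypothetical, not the coffea-casa config).
# `c` is the config object JupyterHub provides when it loads this file.
# Assumes the oauthenticator package and an OIDC client already registered with
# the identity provider; endpoint paths, client ID/secret and callback URL are placeholders.
from oauthenticator.generic import GenericOAuthenticator

c.JupyterHub.authenticator_class = GenericOAuthenticator

c.GenericOAuthenticator.client_id = "example-client-id"          # placeholder
c.GenericOAuthenticator.client_secret = "example-client-secret"  # in practice injected from a sealed secret
c.GenericOAuthenticator.oauth_callback_url = "https://coffea.casa/hub/oauth_callback"
c.GenericOAuthenticator.authorize_url = "https://cms-auth.web.cern.ch/authorize"  # assumed endpoint path
c.GenericOAuthenticator.token_url = "https://cms-auth.web.cern.ch/token"          # assumed endpoint path
c.GenericOAuthenticator.userdata_url = "https://cms-auth.web.cern.ch/userinfo"    # assumed endpoint path
c.GenericOAuthenticator.scope = ["openid", "profile", "email"]
c.GenericOAuthenticator.username_claim = "email"

# For throwaway test instances, dummy authentication skips the registration step:
# from jupyterhub.auth import DummyAuthenticator
# c.JupyterHub.authenticator_class = DummyAuthenticator
```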

8 of 36

The Four Casa Instances

– CMS-Prod (https://coffea.casa)
– Opendata-Prod (https://coffea-opendata.casa)
– CMS-Dev
– Opendata-Dev

9 of 36

Workflow Scale Out

Scale-out is accomplished with a custom Dask-Jobqueue class that deploys Dask workers either on our T2 resource or in an HTCondor cluster running inside the Flatiron Kubernetes cluster.
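
The pattern looks roughly like the following sketch, which uses the stock dask_jobqueue HTCondorCluster rather than our custom class; the custom class plays the same role but adds site-specific settings (container image, tokens, networking), so the parameter values below are illustrative assumptions only.

```python
# Scale-out sketch with the stock dask_jobqueue HTCondorCluster; values are illustrative.
from dask.distributed import Client
from dask_jobqueue import HTCondorCluster

cluster = HTCondorCluster(
    cores=4,        # cores per HTCondor job / Dask worker
    memory="8 GB",  # memory request per worker
    disk="4 GB",    # scratch-disk request per worker
)
cluster.scale(jobs=20)  # submit 20 worker jobs to the HTCondor pool

client = Client(cluster)  # the notebook's Dask client; work now fans out to the workers
print(client)
```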

10 of 36

Storage & Data Access

– Each user is given 10 GB of persistent storage at login.

– XCache access via tokens issued at login.

– cern.ch CVMFS is mounted in the pods.

– The user's T2 /store/user area is mounted in the user pod.
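
For illustration, reading a file through XCache from a notebook might look like the sketch below, using uproot. The XCache endpoint, token path and LFN are placeholders, and exposing the login-issued token via the standard BEARER_TOKEN_FILE variable is an assumption about how the XRootD client finds it.

```python
# Hypothetical read through the site XCache with uproot; requires an XRootD backend.
import os
import uproot

os.environ.setdefault("BEARER_TOKEN_FILE", "/etc/cmsaf-secrets/access_token")  # placeholder token path

xcache = "root://xcache.example.edu/"                   # placeholder XCache endpoint
lfn = "/store/user/someuser/example/nanoaod_file.root"  # placeholder LFN

with uproot.open(xcache + lfn) as f:
    events = f["Events"]
    print(events.num_entries)
```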

11 of 36

Triton Inference Service

– To leverage our V100 GPU, an inference service is deployed in the CMS-Dev instance.

– Training sets can be stored in an S3 bucket deployed for just this task.

s3://rook-ceph-rgw-my-store.rook-ceph.svc:80/triton-c9adf042-ffb8-4221-bd42-e385efb1d0e2
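
A client-side query against such a Triton service might look like the following tritonclient sketch; the server URL, model name and tensor names/shapes are placeholders rather than the deployed model's actual interface.

```python
# Hypothetical Triton inference request over HTTP with tritonclient; names/shapes are placeholders.
import numpy as np
import tritonclient.http as triton_http

client = triton_http.InferenceServerClient(url="triton.example.edu:8000")  # placeholder URL

batch = np.random.rand(16, 20).astype(np.float32)  # dummy input batch

infer_input = triton_http.InferInput("INPUT__0", batch.shape, "FP32")
infer_input.set_data_from_numpy(batch)

result = client.infer(
    model_name="example_model",  # placeholder model name
    inputs=[infer_input],
    outputs=[triton_http.InferRequestedOutput("OUTPUT__0")],
)
print(result.as_numpy("OUTPUT__0").shape)
```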

12 of 36

ServiceX

13 of 36

14 of 36

15 of 36

16 of 36

17 of 36

18 of 36

19 of 36

20 of 36

FNAL's Elastic Analysis Facility

Burt Holzman – Project lead

Maria Acosta – Technical lead for applications

Chris Bonnaud – Technical lead for infrastructure

Joe Boyd, Glenn Cooper, Lindsey Gray,

Farrukh Khan, Ed Simmonds, Nick Smith, Elise Chavez

21 of 36

22 of 36

Onsite Login

23 of 36

Login: MultiVO Support

24 of 36

Application Ecosystem

25 of 36

26 of 36

Triton

27 of 36

28 of 36

29 of 36

They've built a multi-VO, secure, integrated Elastic Analysis Facility prototype in compliance with DOE/Lab cybersecurity requirements.

It started as a USCMS project but has grown into a multi-experiment initiative providing services to our experiments and scientists.

• Developed more than 20 environments for experiments, with dedicated CVMFS mounts, shared storage and specific scientific software.

• Collaborating with multiple groups across the laboratory as well as industry partners, open-source projects and other institutions has allowed us to gain insights into what our users _really_ want and need.

• Strengthened participation with IRIS-HEP and the USCMS collaboration on building next-generation analysis facilities in the US.

FNAL's Elastic Analysis Facility

30 of 36

MIT's Analysis Facility

Mariarosaria D'Alfonso, Josh Bendavid, Chad Freer, Zhangqier Wang, Luca Lavezzo, Christoph Paus

31 of 36

subMIT

An MIT Physics Department analysis facility.

→ provides ecosystems for many research areas

https://submit.mit.edu

The subMIT system provides an interactive login pool + scale-out to batch resources:

  • Home directories
  • SSH or JupyterHub access
  • Convenient software environment (CentOS 7 native, Docker/Singularity images, conda)
  • Local batch system with O(1000) cores, >50 GPUs (a submission sketch follows below)
  • Local storage (1 TB/user), tens of TB for larger group datasets
  • Fast networking: 100 Gbps Ethernet
  • Convenient access to larger external resources (OSG, CMS Tier-2 and Tier-3, LQCD Cluster, EAPS)
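
As a minimal illustration of using the local batch system, here is a sketch of submitting jobs with the htcondor Python bindings from a login node; the executable, file names and resource requests are placeholders, and any subMIT-specific requirements are omitted.

```python
# Minimal HTCondor submission sketch with the htcondor Python bindings; values are placeholders.
import os
import htcondor

os.makedirs("logs", exist_ok=True)

sub = htcondor.Submit({
    "executable": "run_analysis.sh",   # placeholder user script
    "arguments": "$(ProcId)",
    "output": "logs/job_$(ProcId).out",
    "error": "logs/job_$(ProcId).err",
    "log": "logs/job.log",
    "request_cpus": "1",
    "request_memory": "2GB",
})

schedd = htcondor.Schedd()             # local schedd on the login node
result = schedd.submit(sub, count=10)  # queue 10 jobs
print("Submitted cluster", result.cluster())
```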

32 of 36

subMIT

33 of 36

CMS connection to subMIT


Connected to all resources on campus

[Diagram: the 'subMIT' virtual center (submit.mit.edu) connects through three frontends: a Campus frontend to resources across MIT campus (CMS T2_US_MIT, CMS T3_US_MIT, the EAPS Earth and Planetary Sciences cluster, HPRCF, Bates; partly at normal priority, partly preempted), an OSG frontend to the Open Science Grid with plenty of resources at various universities and laboratories across the US, and a CMS frontend to CMS computing resources across the world, restricted to CMS members. Campus groups such as CTP and MKI also use the facility.]

34 of 36

Examples of workflows on subMIT from LHC/CMS

Very different analysis requirements:

  1. Search for a rare decay of the Higgs boson (CADI HIG-23-005):
    • largely profits from event-size reduction, simple calculations and an almost interactive analysis
    • small final dataset for ML inference, GPU used for training
    • uses SSD disk, ROOT RDataFrame and correctionlib for systematics evaluation (see the sketch after this list)
  2. Search for Soft Unclustered Energy Patterns (SUEPs) (CADI EXO-23-002):
    • real-time analysis: recluster the "jets", select SUEP candidates and boost into that frame
    • relies heavily on parallelization (batching with HTCondor)
    • Coffea analysis framework
  3. Measurement of the W boson mass (CADI SMP-23-002):
    • challenge in bookkeeping the templates for systematic variations of uncertainty weights for both background and signal, i.e. building O(10^3) replicas of the final histograms
    • needs multithreading and faces memory challenges
    • needs a big machine for now
    • GPU used for the final fit
    • ROOT RDataFrame

Common feature: all use the NanoAOD simplified data format as input
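
To make the common ingredients concrete, here is an illustrative Python sketch in the spirit of workflows 1 and 3: a NanoAOD-style selection with ROOT RDataFrame plus a correctionlib scale-factor lookup. The file, tree, branch and correction names are placeholders, not taken from the analyses above.

```python
# Illustrative RDataFrame + correctionlib sketch; all names are placeholders.
import correctionlib
import ROOT

ROOT.EnableImplicitMT()  # multithreaded event loop

df = ROOT.RDataFrame("Events", "nanoaod_example.root")  # placeholder NanoAOD file

# Event-size reduction and a derived quantity:
df = df.Filter("nMuon >= 2", "at least two muons")
df = df.Define("leading_mu_pt", "Muon_pt[0]")

hist = df.Histo1D(
    ("leading_mu_pt", ";p_{T} [GeV];Events", 50, 0.0, 200.0),
    "leading_mu_pt",
)
print("mean leading-muon pT:", hist.GetValue().GetMean())  # triggers the event loop

# Systematic-style weight lookup with correctionlib (placeholder JSON and key):
cset = correctionlib.CorrectionSet.from_file("muon_sf.json")
sf_nominal = cset["muon_id_sf"].evaluate(1.2, 45.0, "nominal")
print("example scale factor:", sf_nominal)
```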


35 of 36

Deployed Features by Analysis Facility

| Facility          | Dask support | Batch    | XCache | ServiceX | JH Interface | MLflow | Triton | GPU support    |
|-------------------|--------------|----------|--------|----------|--------------|--------|--------|----------------|
| EAF               | Dask Gateway | HTCondor | x      | x        | x            | -      | x      | x              |
| Coffea-casa @ UNL | Via HTCondor | HTCondor | x      | x        | x            | x      | x      | 1 GPU          |
|                   | Via Slurm    | Slurm    | ?      | -        | x            | -      | ?      | 4 GPU per node |
|                   | WIP          | WIP      | ?      | -        | x            | -      | x      | 2 GPU          |

36 of 36

AGC at US CMS Analysis Facilities

Questions?