1 of 11

Prototyping with the Scalable Systems Laboratory

Future Analysis Systems and Facilities

October 26-27, 2020

Supported by the National Science Foundation under Cooperative Agreement OAC-1836650


2 of 11

Some SSL deployments


  • DOMA::ServiceX - data transformation and delivery service for LHC analyses (IRIS-HEP)
  • DOMA::Skyhook - programmable storage for databases, scaling Postgres with the Ceph object store (IRIS-HEP)
  • REANA - reusable analysis service (CERN development team)
  • CoDaS Platform - JupyterLab notebooks and access to GPU resources on the Pacific Research Platform for the annual summer CoDaS-HEP training event (IRIS-HEP SSC area)
  • Frontier Analytics - analyze and improve data access patterns for ATLAS Conditions Data (ATLAS Distributed Computing group)
  • perfSONAR Analytics - network route visualization based on perfSONAR traces (NSF SAND project)
  • Parsl / funcX - parallel programming in Python and serverless computing with supercomputers (Computer Science)
  • Large-Scale Systems Group @ UChicago - serverless computing with Kubernetes (Computer Science)
  • SLATE & OSG - backfilling otherwise unused cycles on the SSL with work from the Open Science Grid and ATLAS, using the SLATE tools

A diversity of services and developer engagements

3 of 11

Multi-Site K8s Platform

  • We have a Year 3 milestone in IRIS-HEP to build out a multi-site k8s platform to support "Analysis Systems" and DOMA activities
  • Recall our blueprint meeting at NYU (June 2019), which identified k8s as a good "substrate" technology (summary report)
  • The goal here is to advance the past year's site-level k8s experience to a multi-site platform, which will likely be required for future analysis grand challenges (see the sketch below)
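As a concrete starting point, here is a minimal sketch of treating the existing clusters as one multi-site substrate from a single kubeconfig, using the Python Kubernetes client; the context names are placeholders, not the clusters' actual kubeconfig entries.

# Minimal sketch: query several SSL clusters from one kubeconfig.
# The context names are hypothetical; substitute whatever names your
# kubeconfig uses for the clusters described on the following slides.
from kubernetes import client, config

CONTEXTS = ["river-dev", "nautilus", "uw-tiger"]   # placeholder context names

for ctx in CONTEXTS:
    config.load_kube_config(context=ctx)           # switch to this cluster
    v1 = client.CoreV1Api()
    nodes = v1.list_node().items
    cores = sum(int(n.status.capacity["cpu"]) for n in nodes)
    print(f"{ctx}: {len(nodes)} nodes, {cores} cores")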


4 of 11

Some "app" ideas


5 of 11

Clusters (4)

SSL-UChicago-RIVER-dev: development RIVER cluster

  • k8s version: v1.17.11
  • cvmfs: yes
  • Local filesystem: Rook
  • Resources:
    • CPU: 432 cores (9 machines, 48 cores per machine)
    • Storage: 3.2 TB (rook/ceph)
  • Pod types available: CPU
  • Gaining access: same as SSL-UChicago-RIVER

UNL

  • k8s version: 1.18.6
  • Ceph storage (small, old hardware, ~4 TB only)
  • CMS OAuth-enabled access (to the JupyterHub resource)
  • JupyterHub with job scaling to the HTCondor pool of our T2
  • Hosts (role, cores, RAM):
    • red-kube-vm00[1,2,3]: masters, VMs living on c07[14,16,18].shor, 2 cores, 8 GB
    • red-kube-c07[24,26,28,30]: workers, R710s with disks for Rook.io, 24 cores, 96 GB
    • red-kube-c10[35,36,37]: workers, Sun X2200, 8 cores, 32 GB
    • red-kube-c69[21-26]: workers, Sun X2200, 8 cores, 24 GB
    • red-kube-c69[27-30]: workers, 4-in-2 Supermicro, 16 cores, 64 GB
    • red-kube-c6931: worker, 1U Supermicro, 8 cores, 32 GB

PRP Nautilus

  • k8s version: 1.18.6
  • cvmfs: yes
  • Local filesystem: Rook, SeaweedFS, BeeGFS
  • Resources:
    • CPUs: 7000 cores
    • GPUs: 500+
    • Storage: 2.5+PB (rook/ceph), 2PB BeeGFS
  • Pod types available: Any (CPU, GPU, FPGA); see the GPU pod sketch after this list
  • Gaining access: https://ucsd-prp.gitlab.io/userdocs/start/toc-start/
  • OpenNSA controller (L2 networking)
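Because Nautilus exposes GPUs (and FPGAs) to pods, a minimal sketch of requesting a single GPU via the Python Kubernetes client follows; the context name, namespace, and image are placeholders, and PRP may enforce additional scheduling policies not shown here.

# Minimal sketch: request one GPU on a Nautilus-style cluster.
# Context name, namespace, and image are placeholders.
from kubernetes import client, config

config.load_kube_config(context="nautilus")        # hypothetical context name
v1 = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(
            name="cuda",
            image="nvidia/cuda:11.0-base",
            command=["nvidia-smi"],
            resources=client.V1ResourceRequirements(
                limits={"nvidia.com/gpu": "1"}     # standard NVIDIA device-plugin resource
            ),
        )],
    ),
)
v1.create_namespaced_pod(namespace="my-namespace", body=pod)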

UW-Tiger

  • k8s version: v1.19.1
  • Resources: constantly expanding as time permits; mostly older hardware, with more old and new machines on the way
    • As of Oct 13, 2020:
      • Limited non-shared persistent local storage
      • 368 cores (8 × 40-core old nodes, 1 × 48-core new node)
    • Soon:
      • 1200+ more old cores
      • 30 TB rook/ceph
      • 8 × 48-core new CPU nodes
    • Slightly later:
      • More newer cores
      • 1.5 PB rook/ceph
  • Gaining access: email BrianB and JeffP


6 of 11

Clusters

UIUC-Boneyard

  • k8s version: 1.18.8
  • Compute hardware (~400 cores):
    • 22 Dell PowerEdge R410 (mixture of 16/24 cores, 2.54/2.8 GHz, 23.5/49.5 GB)
    • 4 Dell PowerEdge R710 (mixture of 16/24 cores, 2.5/2.7 GHz, 24.7 GB)
  • Storage hardware (~80 TB)
    • 5 Dell MD1200 (~80 TB)
    • 2 Dell R510 w/ 12 3.5" drives
  • Current configuration (subset of total to be brought online, as above)
    • 1 node (boneyard.ncsa.illinois.edu) for login & kubectl operation
    • 1 hardware/OS management node
    • 3 k8s control plane nodes
    • 3 k8s compute nodes
    • 1 storage node (w/ two connected enclosures, 30 TB operational in total)
  • Successfully deployed the funcX service on k8s; more testing underway (see the client sketch after this list)
  • Pod types available: CPU
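To complement the funcX deployment noted above, here is a minimal client-side sketch, assuming the funcX SDK API of this period and a hypothetical endpoint UUID backed by the Boneyard service.

# Minimal sketch of calling a function through a funcX endpoint.
# The endpoint UUID is a placeholder for one backed by the k8s-hosted
# funcX service described above.
from funcx.sdk.client import FuncXClient

def platform_report():
    import platform                       # imports live inside the function for remote execution
    return platform.node()

fxc = FuncXClient()                       # authenticates via Globus Auth
func_id = fxc.register_function(platform_report)
task_id = fxc.run(endpoint_id="<boneyard-endpoint-uuid>", function_id=func_id)
print(fxc.get_result(task_id))            # raises if the task has not finished yet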


SSL-UChicago-RIVER: production RIVER cluster (production SSL cluster)

  • k8s version: v1.16.7
  • cvmfs: yes
  • nearby xcache: xcache.mwt2.org
  • Local filesystem: Rook (see the PVC sketch after this list)
  • Resources:
    • CPU: 2784 cores (58 machines, 48 cores per machine)
  • GPU: none (note: the FIONA has been moved back to the UChicago cluster)
    • Storage: 23 TB (rook/ceph)
  • Pod types available: CPU
  • Gaining access: contact UC admins
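Several of these clusters back local storage with Rook/Ceph, so a user's first step is typically a PersistentVolumeClaim. A minimal sketch follows; the StorageClass name "rook-ceph-block" is the common Rook default and is an assumption here, as is the namespace.

# Minimal sketch: claim 10 GiB of Rook/Ceph block storage.
# "rook-ceph-block" is the usual Rook default StorageClass name; check
# each cluster for its actual class names before using this.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="analysis-scratch"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="rook-ceph-block",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
v1.create_namespaced_persistent_volume_claim(namespace="my-namespace", body=pvc)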

7 of 11

Capabilities (for a prototype)

  • Simple access & auth (see the sketch below)
  • Simple app
  • Simple scheduling
  • Simple data
  • Simple monitoring

What are the primitives required for DevOps of analysis systems?


(diagram: topology)
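One candidate answer for "simple access & auth" is a namespace per user group plus a locally created service account handed to that group; a minimal sketch, with the group name purely illustrative (RBAC scoping would follow).

# Minimal sketch: one namespace per user group, with a service account the
# group can use for kubectl / CI access. Names are illustrative only.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

group = "servicex-dev"                                    # hypothetical user group
v1.create_namespace(client.V1Namespace(
    metadata=client.V1ObjectMeta(name=group)))
v1.create_namespaced_service_account(
    namespace=group,
    body=client.V1ServiceAccount(metadata=client.V1ObjectMeta(name=f"{group}-robot")))
# A Role/RoleBinding scoped to the namespace would be added next, which is
# what keeps this "simple" access from becoming cluster-wide.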

8 of 11

Capabilities for Analysis Wish List

  • Add your wished-for capability or expectation here
  • How much latency to expect?
    • Can resource targets expose this (a sort of QoS label)?
  • funcX - Matthew's talk
  • Building framework-agnostic infrastructure capable of supporting various ADs (Luca's talk)
  • MadMiner / REANA
  • Instrumentation primitives - sidecars, ElastiFlow, Elasticsearch (see the sidecar sketch after this list)
  • Must fit into the infrastructure we operate and own
    • at present, they look quite different
  • But what about the facilities we don't own?
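For the instrumentation-primitives item, the usual Kubernetes pattern is a sidecar container sharing the pod with the application; a minimal sketch, with image names and the shared log path as placeholders.

# Minimal sketch of the sidecar pattern: the application writes logs to a
# shared emptyDir volume and a shipper container forwards them (e.g. to
# Elasticsearch). Image names and paths are placeholders.
from kubernetes import client

shared = client.V1Volume(name="logs", empty_dir=client.V1EmptyDirVolumeSource())
mount = client.V1VolumeMount(name="logs", mount_path="/var/log/app")

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="analysis-worker"),
    spec=client.V1PodSpec(
        volumes=[shared],
        containers=[
            client.V1Container(name="app", image="my-analysis:latest",
                               volume_mounts=[mount]),
            client.V1Container(name="log-shipper", image="my-log-shipper:latest",
                               volume_mounts=[mount]),
        ],
    ),
)
# Submitting it is then the usual create_namespaced_pod(...) call.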


9 of 11

Project Mode

  1. Explore and experiment
  2. Build & Operate
  3. Capture blueprints
  4. Metrics & lessons
  5. Teardown


10 of 11

Costs and QoS

  • From yesterday's whiteboard discussion
  • How can the infrastructure communicate expected "costs"?
  • Costs can be labels indicating performance, like QoS (see the label sketch after this list)
  • They can also indicate resource consumption
  • Users will be better equipped to optimize their use of the available resources
  • We have already seen this with ServiceX, owing to the lack of QoS in a data delivery system (reading from local and remote storage systems)
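One way to communicate such costs is as labels on resource targets that workloads can select on; a minimal sketch follows, where the label key and values are entirely hypothetical rather than an existing SSL convention.

# Minimal sketch: expose a "cost"/QoS hint as a node label and let a pod
# select on it. The label key, values, and node name are hypothetical.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Admin side: advertise the expected data-access latency class of a node.
v1.patch_node("worker-01",
              {"metadata": {"labels": {"ssl.example.org/latency-class": "low"}}})

# User side: a pod that only wants "low latency" targets.
pod_spec = client.V1PodSpec(
    node_selector={"ssl.example.org/latency-class": "low"},
    containers=[client.V1Container(name="job", image="my-analysis:latest")],
)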


11 of 11

Initial Planning & Questions?

  • The goal is to provide the Kubernetes API to systems "user" groups (those developing, deploying, and operating the analysis systems)
    • so any application built for k8s
    • service deployments and batch workloads (multi-cluster scheduling)
  • Create a lightweight federation mesh with Admiralty.io (existence proof w/ PRP; see the sketch after this list)
  • Organizing access
    • Simple service accounts created locally, with intentional shares
    • Namespace to namespace federation (source and target)
      • perhaps namespace for each user group?
    • Credential management using Admiralty Cloud?
    • A minimal trust agreement, AUP, and SLA (one paragraph each!)
  • Helm can be used for deployment (recommended?); CI as a user-supplied requirement?
  • CPU, GPU, and storage numbers for winter/spring to set the scale (target and available)
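For the Admiralty.io item above, the part a user group sees is small: a pod (or pod template) opts in to multi-cluster scheduling through an election annotation in its federated namespace. The sketch below reflects the Admiralty multicluster-scheduler documentation of this period; the annotation key, context, namespace, and image should all be treated as assumptions to verify against the deployed version.

# Minimal sketch: opt a pod into Admiralty multi-cluster scheduling from a
# source-cluster namespace that has been federated to one or more targets.
# The "multicluster.admiralty.io/elect" annotation follows the Admiralty
# docs of this era; verify against the version actually deployed.
from kubernetes import client, config

config.load_kube_config(context="source-cluster")        # hypothetical context
v1 = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="federated-job",
        annotations={"multicluster.admiralty.io/elect": ""},  # opt in to federation
    ),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(name="job", image="my-analysis:latest",
                                       command=["echo", "hello from some cluster"])],
    ),
)
v1.create_namespaced_pod(namespace="analysis-systems", body=pod)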

Next meeting is November 3
