Open Data Facility
Design Preview
v0.2 — for the IRIS-HEP SSL team
May 7, 2026
Aidan • Fengping • David • Farnaz • Judith
Rob Gardner · UChicago / EFI · May 7, 2026 · with AI assistance from Claude
THE FRAME
Why ODF, why RP1, why now
Why ODF
Why RP1
Why now
Open Data Facility on RP1 — Design Preview v0.2
DESIGN
Four conceptual principles
Multiple interfaces, one substrate
BinderHub, JupyterHub, REANA — and a forward path to IRI-mediated and agentic-ready surfaces. Same backend, same data, same IAM.
Mirror, not copy
A US-side mirror of the CERN-curated open data with provenance preserved back to Zenodo / experiment DOI. We serve, we don't republish.
Capability without accounts
Three IAM-scoped tiers — Public, Vetted, Anointed — built on institutional SSO. No AF Unix accounts required.
Do good first, do better next week
Phased delivery driven by realized use rather than projected scale. Hardware in the margins; the real ask is engineering effort.
REQUIREMENTS
Six use cases driving the design
From the November 2025 ATLAS Open Data tutorial. Every architectural choice serves these.
6.1
NTuple education / outreach
Stack A • BinderHub → JupyterHub
6.2
PHYSLITE columnar at scale
Stack B + A • JupyterHub + REANA
6.3
Statistical inference (pyhf)
Stack A • REANA (GPU-optional)
6.4
ML on HEP data (BDT / DNN / GNN)
Stack A • JupyterHub + REANA • GPU
6.5
MC event generation
Stack B • REANA + Apptainer
6.6
Systematics / PHYSLITEtoOpenData
Stack B • REANA + AnalysisBase
4 of 6 use cases are most natural in REANA — that's why REANA is a peer interface, not a luxury add-on.
USERS
Three communities, three tiers
Out-of-band researcher
WORKLOAD
NTuple H→γγ exploration, individual ML experiments
TIER
Public → Vetted
INTERFACES
BinderHub → JupyterHub; REANA for re-runs
COMPUTE
2–8 cores · 0–1 GPU on request
STORAGE
0 GB (Public) / 50 GB home (Vetted)
Workshop / classroom (~30–40 students)
WORKLOAD
Tutorial notebooks, paced exercises
TIER
Vetted (service-account)
INTERFACES
BinderHub (ephemeral) · JupyterHub (multi-day)
COMPUTE
2 cores/user · ≤1 shared GPU/session
STORAGE
5–10 GB scratch / student
HEP-ML developer
WORKLOAD
FM pre-training, BDT/GNN training, columnar pipelines
TIER
Vetted → Anointed
INTERFACES
JupyterHub (dev) + REANA + agentic surfaces
COMPUTE
Burst Dask cluster · 1–8 GPUs
STORAGE
50 GB home + ≥10 TB project
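The classroom figures above imply an aggregate envelope that is easy to work out. The following is a minimal sketch of that arithmetic, assuming the per-student numbers on this slide (2 cores/user, 5–10 GB scratch, ~30–40 students) apply uniformly; the `Range` helper is hypothetical and only exists for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high] used for simple best/worst-case sizing."""
    low: int
    high: int

    def times(self, other: "Range") -> "Range":
        # Both quantities are non-negative, so the bounds multiply directly.
        return Range(self.low * other.low, self.high * other.high)


# Figures taken from the Workshop / classroom column of this slide.
students = Range(30, 40)
cores_per_student = Range(2, 2)
scratch_gb_per_student = Range(5, 10)

total_cores = students.times(cores_per_student)            # 60 to 80 cores
total_scratch_gb = students.times(scratch_gb_per_student)  # 150 to 400 GB

print(f"classroom cores: {total_cores.low}-{total_cores.high}")
print(f"classroom scratch (GB): {total_scratch_gb.low}-{total_scratch_gb.high}")
```

Under these assumptions a single classroom session needs roughly 60–80 cores and 150–400 GB of scratch, before any shared GPU or system overhead is counted.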
CAPABILITIES
Four interfaces over one substrate
BinderHub
Ephemeral · public · lowest friction
Stateless containers built on demand.
For users who click 'launch' on a tutorial link.
Public • Vetted
JupyterHub
Interactive · persistent · AF-like
Tier-aware spawner profiles, persistent home, project scratch on EOS. The dev surface.
Vetted • Anointed
REANA
Declarative · reproducible · async
Pinned-container workflows in YAML.
Right surface for production analyses.
Vetted • Anointed
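One way to make "pinned-container" checkable is to lint a workflow specification before submission. The sketch below is illustrative only: the dictionary mirrors the general layout of a serial `reana.yaml` (steps with `environment` and `commands`), while the image names, step names, and the rule that "pinned" means a digest or a non-`latest` tag are assumptions, not ODF policy.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference has a digest or a tag other than 'latest'."""
    if "@sha256:" in image:
        return True
    # Only look at the final path segment so a registry port such as
    # 'registry:5000/img' is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # an untagged image resolves to 'latest' at pull time
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


def unpinned_steps(spec: dict) -> list:
    """List the names of serial-workflow steps whose container image is not pinned."""
    steps = spec["workflow"]["specification"]["steps"]
    return [step["name"] for step in steps if not is_pinned(step["environment"])]


# Hypothetical specification; image names and files are placeholders.
example_spec = {
    "inputs": {"files": ["analysis.py"]},
    "workflow": {
        "type": "serial",
        "specification": {
            "steps": [
                {
                    "name": "select",
                    "environment": "example.org/odf/hgg-analysis:1.4.2",
                    "commands": ["python analysis.py"],
                },
                {
                    "name": "plot",
                    "environment": "example.org/odf/plotting",
                    "commands": ["python plot.py"],
                },
            ]
        },
    },
    "outputs": {"files": ["results/hist.png"]},
}

print(unpinned_steps(example_spec))
```

Here the `plot` step would be flagged because its image carries no tag. A stricter policy could require digests only.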
Future: Intelligent (IRI / AmSC) + Agentic (MCP · CLI · SKILL)
Programmatic API for AmSC clients to drive ODF as an IRI service · opendata-mcp + reana-mcp + SKILL marketplace at rp1.hl-lhc.io/skills
Not every combination composes — see §4.8 of the design doc for the tensions table (containers vs interactive, agentic vs Public-tier security, sync vs async UX).
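The tier labels on the three current interfaces can be read as a small access matrix. The sketch below encodes exactly the pairings shown on this slide. The deny-by-default behavior for unlisted interfaces, and the names `Tier`, `INTERFACE_TIERS`, `can_use`, and `interfaces_for`, are assumptions made for illustration. The future agentic / IRI surfaces are left out because no tier is stated for them.

```python
from enum import Enum


class Tier(Enum):
    PUBLIC = "Public"
    VETTED = "Vetted"
    ANOINTED = "Anointed"


# Pairings as listed on the interface cards of this slide.
INTERFACE_TIERS = {
    "BinderHub": {Tier.PUBLIC, Tier.VETTED},
    "JupyterHub": {Tier.VETTED, Tier.ANOINTED},
    "REANA": {Tier.VETTED, Tier.ANOINTED},
}


def can_use(tier: Tier, interface: str) -> bool:
    """Deny by default: unknown interfaces are not available to any tier (assumption)."""
    return tier in INTERFACE_TIERS.get(interface, set())


def interfaces_for(tier: Tier) -> list:
    """Return the interfaces available to a tier, sorted by name."""
    return sorted(name for name, tiers in INTERFACE_TIERS.items() if tier in tiers)


for tier in Tier:
    print(tier.value, "->", interfaces_for(tier))
```

Read this way, a Public user reaches only BinderHub, while Vetted is the only tier that spans all three current interfaces.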
SHAPE
Architecture at a glance
INTERFACES
BinderHub
JupyterHub
REANA
(future) Agentic / IRI
OWNER
Aidan & Fengping
COMPUTE & SCHEDULING
RP1 K8s
Kueue
Dask-Gateway
HTCondor
ServiceX/Y
OWNER
Aidan & Fengping
DATA PLANE
MWT2_OPENDATA (dCache / Rucio)
atlasopenmagic (metadata mirror)
User outputs (EOS potentially)
OWNER
Judith
EQUIPMENT & NET
ODF compute pool
GPU pool (A100)
≥25 Gbps fabric
CVMFS
OWNER
David & Farnaz
FEDERATION
UC primary ↔ IU stretched-K8s ↔ Tempest / Pile ↔ NRP / AmSC burst
ROADMAP
Phased plan
Phase 0
Foundation · 0–2 mo
KEY WORK PACKAGES
EXIT CRITERION
Internal user runs H→γγ end-to-end against the local mirror — as a notebook AND as a REANA workflow
Phase 1
Soft launch · 2–6 mo
KEY WORK PACKAGES
EXIT CRITERION
Classroom of 30+ external students completes a tutorial run, hands-off; one HEP-ML dev runs a non-trivial REANA workflow
Phase 2
Scale + open APIs · 6–12 mo
KEY WORK PACKAGES
EXIT CRITERION
AmSC client drives ODF programmatically; opendata-mcp + reana-mcp live; SKILL marketplace open
Phase 3
Graduate · 12+ mo
KEY WORK PACKAGES
EXIT CRITERION
First validated patterns graduate from RP1 to AF; ODF as one node in a federated US mirror
DISCUSSION
Open questions for the team
A sampling of questions to resolve by v0.3.
1
Storage corpus posture
Open Data licensing for Rucio-mirrored hosting — confirm with ATLAS Open Data team. Initial dataset selection and replication-rule shape.
Owner: Judith
2
REANA deployment topology
Single instance shared with AF, or parallel on RP1 with shared storage and IAM? Workflow languages — Serial + Snakemake day-one, defer CWL/Yadage?
Owner: Aidan & Fengping (with AF REANA op)
3
Equipment & networking
ODF compute pool sizing, GPU class, ≥25 Gbps fabric to MWT2_OPENDATA dCache, CVMFS reachability for new node classes.
Owner: David & Farnaz
4
Public-tier security model
Threat-model and hardening plan with UChicago security before public exposure. Gates §4.7 agentic exposure too.
Owner: Aidan, Fengping & Judith
5
GPU contention with internal AF users
Scheduling policy for ODF GPU jobs vs internal ATLAS workloads. REANA-dispatched GPU jobs included in the same policy.
Owner: Aidan & Fengping (with Giordon & Rob)
6
Agentic-tool security posture
Even at Vetted tier, MCP tools wrapping shell execution are risky. Allowlist policy + audit logging — review before WP-19 lands.
Owner: Aidan & Fengping (with Giordon, Ilija and Rob)