1 of 16

Services and Tools for Collaborations: On-Boarding to the OSG Fabric of Services

Pascal Paschos1 Mats Rynge2 Jason Patton3,6

Fabio Andrijauskas4 John Thiltges5 Cannon Lock3,6 &

Brian Bockelman3,6

1U. of Chicago 2U. of Southern California 3U. of Wisconsin-Madison 4U. of California San Diego 5U. of Nebraska-Lincoln, 6Morgridge Institute for Research

2 of 16

Mission of Collaboration Support Services

  • Help (midsize) collaborations leverage OSG Consortium services and technologies for their research
  • Collaboration support coordinates service delivery by OSG/PATh teams in
    • CI integration between on-prem resources and execution & storage endpoints
    • User support
    • Access Points for onboarding
    • Data management solutions
    • User training, education, consultations, documentation, monitoring and regular engagements with collaboration representatives

3 of 16

What kind of collaborations

  • Multi-institutional collaborations in a range of maturity and scale
    1. Exploratory phase, e.g. a proposed HEP experiment. Engagement:
      • Informs planning and science mission deliverables e.g. simulations
    2. Deployment and growth phase. Engagement:
      • Provides a distributed platform of scale (OSPool) to develop, benchmark, test, improve workflows
      • Fosters growth in throughput and volume of data products
      • Trains researchers and technical staff
    3. Production phase. Engagement:
      • Consults and assists due to shortage of dedicated technical teams or expertise
      • Supports computing and storage infrastructure access and capacity
      • Assists in integration of collaboration owned resources into computing pools and connected storage endpoints

4 of 16

Support model and sustainability

[Figure: engagement lifecycle diagram. States: Onboarding, Engaged/Active, Engaged/Inactive, Disengaged, Graduated]

  • 11 engaged
  • 3 onboarding
  • 2 graduated
  • 3 disengaged
  • 3 inactive/engaged

5 of 16

A year of collaboration dHTC jobs in 4 pools

6 of 16

OSG Collab Access Points and Storage

[Figure: diagram of access points and storage. Labels:]

  • General Purpose OSG Collab AP: KOTO, REDTOP, SoLID, EUSO-SPB2, Trinity
  • XENON AP
  • SPT-3G APs
  • Snowmass - FutureColliders
  • Legacy APs
  • OSG Connect AP: EHT, EIC, MOLLER, LIGO*, HepSim
  • OSPool
  • Dedicated Storage (dCache)
  • Open Science Data Federation Origin (OSDF) storage (Ceph)
  • Local Storage (shown five times)
  • Rucio-FTS
  • XRootD Door

7 of 16

The OSG Collab APs provide

  • An MFA-secured single place for researchers from multi-institutional collaborations to work together
  • Access to OSG/PATh expertise in developing and running dHTC-optimized workflows on the OSPool (e.g. with Pegasus WMS) and in packaging software stacks in containers
  • Access to cvmfs
  • Storage allocations to the OSDF origin for data and software which are distributed in caches at global scale
  • Monitoring and accounting dashboards, e.g. GRACC
  • Access to portals for user accounting management
  • Local compute/storage capacity for pre-processing and post-processing data

8 of 16

PATh production services

  • Umbrella group for Collaboration Support
  • CI integration
    • OSG Hosted Compute Entrypoints (CEs)
    • Assist sites to deploy their own self-managed CEs and APs
    • Operate the GlideinWMS framework: matches payloads and sends Glideins (pilots) to sites
    • Help define and assemble an HTCondor resource pool
    • Provide a path to transition ownership of infrastructure to the collaboration
  • Deploy and operate OSDF Origins/Caches
    • Assist sites to deploy their own
  • Host software containers in Oasis and Singularity cvmfs repos
  • Establish a long term relationship with remote site teams
    • Includes consultation by embedding in teams
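
The pilot model named above (payloads are matched, and Glideins are sent to sites) can be illustrated with a toy planner. This is a minimal sketch under stated assumptions, not the GlideinWMS algorithm: the `ComputeEntrypoint` class, the `plan_pilots` function, the one-pilot-per-idle-job rule and the per-CE pilot limit are all hypothetical and chosen only to show the idea of sending pilots to the CEs that accept a given VO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeEntrypoint:
    """Hypothetical description of a CE: which VOs it accepts and a pilot limit."""
    name: str
    supported_vos: frozenset
    max_pilots: int


def plan_pilots(idle_jobs, entrypoints):
    """Decide how many pilots to request at each CE.

    idle_jobs maps a VO name to its number of idle jobs.
    Assumption: one pilot is requested per idle job, up to each CE's limit.
    Returns {ce_name: {vo: pilots}}.
    """
    remaining = {ce.name: ce.max_pilots for ce in entrypoints}
    plan = {}
    for vo, idle in sorted(idle_jobs.items()):
        for ce in entrypoints:
            if idle == 0:
                break
            if vo not in ce.supported_vos:
                continue  # this CE does not accept pilots for this VO
            n = min(idle, remaining[ce.name])
            if n == 0:
                continue  # CE already at its pilot limit
            plan.setdefault(ce.name, {})[vo] = n
            remaining[ce.name] -= n
            idle -= n
    return plan


if __name__ == "__main__":
    ces = [
        ComputeEntrypoint("CE-A", frozenset({"osg"}), max_pilots=2),
        ComputeEntrypoint("CE-B", frozenset({"osg", "collab"}), max_pilots=3),
    ]
    print(plan_pilots({"collab": 2, "osg": 4}, ces))
```

In a real deployment the pilots, once running, join the HTCondor pool and the jobs are matched to them there; the sketch stops at the provisioning decision.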

9 of 16

A prototype of CI in support of a Collaboration

[Figure: prototype CI diagram. Labels:]

  • Users
  • OSG managed AP; Collab managed AP
  • OSPool; Collab pool
  • OSPool CM & Frontend; Collab CM & Frontend
  • CE groups: CE1, CE2, CE3, CE4; CE5, CE6; CE3, CE4
  • OSG VO; OSG/Collab VO; Collab VO
  • Dedicated; Opportunistic (shown twice)

10 of 16

Data Management

  • OSDF Origin
    • Stores data from multiple Virtual Organizations (VOs)
  • OSDF Caches
    • Provide storage that keeps data geographically close to the execution points and access points
  • OSDF Redirector
    • Processes each data request and directs it to the appropriate origin
  • HTCondor integrated tools for transfers
    • A plugin lets users specify the source/destination for data with the appropriate protocol, e.g. the osdf:// protocol. Data are downloaded through the OSDF caches.
  • Rucio/FTS
    • Provides management of distributed storage to collaboration owned end-points (Rucio Storage Elements). Data are replicated via FTS across RSEs & a catalog keeps track of locations. Data can be downloaded via a Rucio client from the nearest RSE that contains it.
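
As an illustration of the HTCondor transfer integration, the sketch below builds a submit description whose input and output use osdf:// URLs. The submit commands (`transfer_input_files`, `transfer_output_files`, `transfer_output_remaps`, `should_transfer_files`, `when_to_transfer_output`) are standard HTCondor ones; the helper name `build_submit_description` and every path and file name in the usage example are hypothetical and would need to be replaced with a collaboration's own namespace.

```python
def build_submit_description(executable, input_urls, output_name, output_url):
    """Return the text of an HTCondor submit description using osdf:// transfers.

    All URLs must use the osdf:// scheme; anything else is rejected so that a
    local path is not silently mixed in.
    """
    for url in list(input_urls) + [output_url]:
        if not url.startswith("osdf://"):
            raise ValueError(f"expected an osdf:// URL, got {url!r}")
    lines = [
        f"executable = {executable}",
        # Inputs are fetched by the file transfer plugin for the osdf scheme.
        f"transfer_input_files = {', '.join(input_urls)}",
        f"transfer_output_files = {output_name}",
        # Remap the job's output file to a destination URL.
        f'transfer_output_remaps = "{output_name} = {output_url}"',
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "log = job.log",
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical namespace and file names, for illustration only.
    print(build_submit_description(
        "run_sim.sh",
        ["osdf:///collab/data/input.tar.gz"],
        "result.root",
        "osdf:///collab/data/results/result.root",
    ))
```

The resulting text can be saved to a file and passed to `condor_submit` on an access point.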

11 of 16

[Figure: data delivery through OSDF and Rucio. Labels:]

  • Panels: OSDF; Rucio
  • OSDF labels: Origin, Redirector, Cache 1, Cache 2, Cache 3
  • Rucio labels: Rucio, FTS, RSE 1, RSE 2
  • Other labels: On prem Experiment storage; Storage Sites; Execution Points (shown twice); Site 1 to Site 4 (shown twice); Site 1 and Site 2
  • Full data can be distributed across RSEs
  • Frequently used data cached at the caching layer
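
The OSDF read path described in the data management slide (caches close to the execution points, a redirector that directs each request to the appropriate origin) can be modelled in a few lines. This is a toy in-memory sketch, not OSDF software: the class names, the namespace-prefix lookup and the least-recently-used eviction policy are assumptions made for illustration.

```python
from collections import OrderedDict


class Origin:
    """Toy origin: holds objects under one namespace prefix."""

    def __init__(self, prefix, objects):
        self.prefix = prefix
        self.objects = objects
        self.reads = 0  # how many times the origin itself was read

    def read(self, path):
        self.reads += 1
        return self.objects[path]


class Redirector:
    """Toy redirector: picks the origin whose prefix matches the path."""

    def __init__(self, origins):
        self.origins = origins

    def locate(self, path):
        matches = [o for o in self.origins if path.startswith(o.prefix)]
        if not matches:
            raise KeyError(f"no origin exports {path}")
        # Prefer the most specific (longest) matching prefix.
        return max(matches, key=lambda o: len(o.prefix))


class Cache:
    """Toy cache with LRU eviction (the eviction policy is an assumption)."""

    def __init__(self, redirector, capacity):
        self.redirector = redirector
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, path):
        if path in self.store:
            self.store.move_to_end(path)  # mark as recently used
            return self.store[path]
        data = self.redirector.locate(path).read(path)
        self.store[path] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used
        return data


if __name__ == "__main__":
    origin = Origin("/collab-a/", {"/collab-a/calib.dat": b"calib"})
    cache = Cache(Redirector([origin]), capacity=1)
    cache.get("/collab-a/calib.dat")
    cache.get("/collab-a/calib.dat")
    print("origin reads:", origin.reads)
```

Repeated requests for the same object are served from the cache, so the origin is read only when the object is missing or has been evicted, which is the behavior behind "frequently used data cached at the caching layer".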

12 of 16

Track Collaboration FTS/Rucio usage

13 of 16

Support Effort at Global Scale

IGWN institutions

Dedicated LIGO pool sites

Access points to the LIGO pool

14 of 16

The KOTO story

  • J-PARC E14 KOTO experiment in Japan searches for new physics that breaks CP symmetry
  • Collaboration users from both Japan and the US submit out of the OSG Collab access point
  • An Engaged/Active collaboration supported in
    • Containerizing software stack, workflow development, job optimization, debugging and storage

15 of 16

Summary

  • Every collaboration is unique, but it is important to have an adaptive template of engagement that sets a timeline for milestones and deliverables
  • Onboarding a collaboration to the OSG Fabric of Services is a guided process to develop independence and self-reliance
  • A sustainable engagement model that supports capacity at scale to meet short and intermediate term needs while cultivating expertise for long term success in the dHTC environment
  • A partnership rooted in a close working relationship and the continuing improvement of infrastructure, processes and tools

16 of 16

Acknowledgements

This work is supported by the National Science Foundation under Cooperative Agreement OAC-2030508. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.