1 of 19

PATh/OSG Collaborations

An Update from the Collaboration Support (CS) Area

PATh Staff Meeting

September 2023

Pascal Paschos

Collaboration Support Lead

University of Chicago

collab_support@osg-htc.org

2 of 19

Objectives of this presentation

  • Update on activities since the last report - April 2023
  • Provide a status refresher on collaborations - where applicable
  • Note any issues or concerns faced by Collaboration Support

3 of 19

Scope of CS Group

  • Collaboration support coordinates and facilitates service delivery between PATh support teams and collaborations
  • Group under the PATh Production Services
    • Core members: Pascal Paschos, Jason Patton
    • Affiliated members: Mats Rynge, Cannon Lock, Fabio Andrijauskas
    • Contributions from UC Infrastructure team and all other PATh teams
  • Extensive details are in the PATh Collaborations monthly reports to the NSF

4 of 19

Areas of Effort

5 of 19

CS Priorities

  • Outward facing: Raise visibility by defining CS services and communicating them to the wider research community
  • Strategy: Implement a sustainable model of support to Collaborations by the OSG Fabric of Services
  • Inward facing: Promote internal coordination between contributing PATh teams to deliver end-to-end service

6 of 19

Graduation Model of Engagement

  • Stages shown in the diagram: Onboarding, Engaged, Active, Inactive, Disengaged, Graduated

7 of 19

Engagement Tracking

8 of 19

Cross-Cutting Highlights

  • CHEP Conference: Attended and presented to the HEP community, which is the primary consumer of CS services
  • HTC23 Conference: Organized the Collaborations Session
  • Bi-weekly Roundtable: An opportunity for collaborations and PATh staff to exchange ideas and best practices on common challenges
  • 2nd Collab Day, Madison: Set priorities for CS for the next 3-4 months
  • New PATh Collab AP and Origin: A joint UC-PATh project to migrate research to new AP
  • Two new* engagements: 1) ePIC - an EIC project. 2) MOLLER VO
  • Data Management dashboard: Track data management for select collaborations

9 of 19

2nd Collaboration Day July 2023

  • Debugged the use of stashcp to access the new OSDF origin on CephFS
  • Discussed project-based scopes to access group directories with tokens. Subsequently implemented.
  • Iterated over needed changes in the OSPool config to account for variations between APs and understand limits in convergence.
  • Discussed the path to a token-based solution for APs (e.g. SPT) that don’t use OSDF but access data on a local dCache with X509s, along with a transition from the gsiftp protocol to https. Currently being implemented.
  • Other topics: Provide a better-documented approach to using bearer tokens to interact with the Origin from outside the PATh infrastructure.
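As an illustration of project-based scopes, the check an origin might apply to a token can be sketched as below. This is a minimal sketch, not the deployed authorization logic: the `storage.read:/path` scope form follows the WLCG-style convention, and the function name `scope_allows`, the project paths, and the scope string are hypothetical examples rather than values from ap23.

```python
from posixpath import normpath


def scope_allows(scopes: str, action: str, path: str) -> bool:
    """Return True if any '<action>:<prefix>' scope covers the given path.

    `scopes` is a space-separated scope string as carried in a token's
    scope claim. A scope covers a path when the path equals the prefix
    or lies underneath it as a directory.
    """
    # Normalize the requested path so '..' segments cannot escape a prefix.
    target = normpath("/" + path.lstrip("/"))
    for scope in scopes.split():
        name, _, prefix = scope.partition(":")
        if name != action:
            continue
        # A scope without an explicit path is treated as covering "/".
        prefix = normpath("/" + prefix.lstrip("/")) if prefix else "/"
        if prefix == "/" or target == prefix or target.startswith(prefix + "/"):
            return True
    return False


# Hypothetical token scopes for a single project (assumed values).
scopes = "storage.read:/collab/projA storage.modify:/collab/projA/scratch"

print(scope_allows(scopes, "storage.read", "/collab/projA/data/file.root"))
print(scope_allows(scopes, "storage.read", "/collab/projB/file.root"))
```

Matching on `prefix + "/"` rather than on the bare prefix keeps a scope for `/collab/projA` from also covering a sibling directory such as `/collab/projAB`.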

10 of 19

New PATh Collab AP and Origin

  • Operationalized new Collab AP ap23.uc.osg-htc - replaced login.collab.ci-connect.net
  • OSDF origin for ap23 is on the Ceph cluster on Tempest
  • Desired functionality with group-based access is enabled and tested
  • Shared config with an OSPool AP - with a small number of differences
  • Migrated ~130 users and their projects from login.collab and login.xenon
  • Transferred 120 TB of data from Typhoon to Tempest
    • Additional 70 TB being transferred from login.collab scratch to ap23 scratch
  • Assisted collabs with a range of modifications to their workflows and access patterns
  • Helped collabs write documentation for their users for the transition

11 of 19

New* Engagements

  • Two projects came under our purview
    • ePIC
      • A project under the EIC VO. The osg.eic project remains unchanged.
      • Joint effort with JLab
      • Dual AP (JLab and OSG Collab) under the same project name
      • Submits to the OSPool and the EIC pool
    • MOLLER
      • Not new. Pre-existing project osg.MOLLER migrated to collab.MOLLER
      • Joint effort with JLab, which is setting up a separate AP
      • To be a separate VO and submit to its own pool and the OSPool

12 of 19

Data Management Dashboard

13 of 19

In other news

14 of 19

Compute Usage (since April)

  • 150 million CPU-core hours
  • Multiple pools
  • OSG & Collab AP

15 of 19

OSDF Origin Status

x ½

16 of 19

Other Highlights

  • Completed deployment of an RSE for XENON on NRP storage
  • Completed changes to the KOTO pipeline, which was causing many jobs to be ejected for exceeding walltime limits
  • Completed assisting Georgia Tech with redeploying their CEs
  • A trend away from self-managed CEs in favor of hosted CEs
    • cvmfs-exec default for HostedCE?
  • Typical issues across collaborations
    • Sites dropping: some with an expired hostcert, others with cvmfs or cvmfs repos unavailable. ARC CEs are a special class of issues.
  • Emerging question: How can I run on the (OS)Pool and use my token to write to places other than storage on the AP?

17 of 19

Sample to-do list

  • Complete updates to HTCondor 10:
    • IGWN and JLab FE - also still on OSG3.5
    • Remaining APs on Connect - e.g. SPT; though they are on OSG3.6
  • Test OSDF access for LIGO using SciTokens - still using X509s
  • Replace remaining GridFTP doors with WebDAV ones wherever possible
  • Continue deployment of RSEs for XENON on NRP storage provisioned by Frank
  • Guide MOLLER through the process of forming their own pool
  • Plan for Snowmass21 data. Moved to BNL and local disk. Used by futurecolliders.
  • Long term plan: Migrate Rucio services to Tempest

  • Token-based access to the OSDF origins from outside our infrastructure
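For token-based access from outside the infrastructure, the client side amounts to presenting the token as an HTTP bearer credential over https. The sketch below only builds the request and sends nothing; the helper name `build_origin_request`, the origin URL, and the token value are placeholders assumed for illustration, not the real ap23 origin endpoint.

```python
import urllib.request


def build_origin_request(url: str, token: str) -> urllib.request.Request:
    """Build an HTTPS GET request that carries a bearer token.

    The token is sent in the standard 'Authorization: Bearer <token>'
    header (RFC 6750). Nothing is transmitted until the request is
    passed to urllib.request.urlopen().
    """
    token = token.strip()
    if not token:
        raise ValueError("empty bearer token")
    if not url.startswith("https://"):
        # Bearer tokens should never travel over an unencrypted channel.
        raise ValueError("bearer tokens require an https URL")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder URL and token, assumed for illustration only.
req = build_origin_request(
    "https://origin.example.org/collab/projA/file.dat",
    "PLACEHOLDER_TOKEN\n",
)
print(req.get_method(), req.full_url)
```

Stripping whitespace matters in practice because a token read from a file usually ends with a newline, which would otherwise corrupt the header.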

18 of 19

Summary

  • Overall, the summer was dominated by tasks for the transition
  • The deployment of the OSDF origin at UC was met with many challenges
  • Looking forward to normalizing ops out of ap23 for collaboration projects
  • Will continue to phase out usage of X509s wherever possible and as we identify opportunities
  • Communication with sites and staff bandwidth remain the number one barriers to resolution efficiency when supporting multi-institutional organizations at a global scale

19 of 19

Questions?
