Fabric Operations (9/24)

  • NSLS2 HPC compute cluster purchase - PO dispatched to Supermicro
    • 2 Racks - will be located in LQCD room in BCF
    • 30 nodes - 18 without GPUs, 12 with 2x V100S GPUs
      • 2x Intel Xeon Gold 6252 Cascade Lake CPUs @ 2.10 GHz (96 logical cores total)
      • 768 GB (12 x 64 GB DDR4-2933 DIMMs) memory
      • 1x EDR InfiniBand
      • 2x 10 Gbps NICs (only one port used initially)
    • NSLS2 requires AD accounts/authentication
      • Systems will be partially isolated from the rest of the facility due to a different account management system
      • Working with ITD on using Centrify as a solution to enable AD on these hosts
  • HTC/HPC Singularity upgrade
    • Resolved ‘singularity shell’ issue in 3.6 release
    • Test 3.6.x version on rplay53 updated to 3.6.3 to resolve security issues
      • Also testing on ATLAS T1
      • Target upgrade on entire farm in 1-2 weeks
  • Will’s password-change web interface, which implements the CS NIST 800-63B requirement, is live
  • Closing 85 ATLAS T1 (2015 purchase) nodes to jobs tomorrow; they will be moved to the shared pool shortly
  • Working on creation of local Docker registry with pull-through caching
    • Hardware racked/built, now configuring the service
    • Can be used with Singularity
  • Discussing sPHENIX VOMS alternatives with SDCC ISSOs & federated auth admins
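
For reference, the aggregate capacity of the NSLS2 cluster purchase can be worked out from the per-node figures listed above. This is a rough sketch only: it assumes the CPU and memory specification applies uniformly to all 30 nodes, and the constant and function names are illustrative, not taken from any SDCC tooling.

```python
# Per-node figures come from the purchase notes above. Applying them
# uniformly to all 30 nodes is an assumption made for this sketch.
NODES_TOTAL = 30
NODES_WITH_GPU = 12
GPUS_PER_GPU_NODE = 2          # 2x V100S per GPU node
LOGICAL_CORES_PER_NODE = 96    # 2x Xeon Gold 6252
MEMORY_GB_PER_NODE = 768       # 12 x 64 GB DIMMs


def cluster_totals():
    """Return aggregate figures for the whole purchase."""
    return {
        "logical_cores": NODES_TOTAL * LOGICAL_CORES_PER_NODE,
        "memory_gb": NODES_TOTAL * MEMORY_GB_PER_NODE,
        "gpus": NODES_WITH_GPU * GPUS_PER_GPU_NODE,
        "cpu_only_nodes": NODES_TOTAL - NODES_WITH_GPU,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

The CPU-only node count produced here should match the 18 nodes quoted in the purchase summary, which serves as a quick consistency check on the figures.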