1 of 6


The Role of Research Software Engineers

at the San Diego Supercomputer Center

Frank Würthwein

Director SDSC

November 13th, 2022

2 of 6

SDSC’s Mission

  • SDSC adopts and partners on innovations from industry and academia in software, hardware, computational & data sciences, and related areas, and translates them into cyberinfrastructure that solves practical problems across all scientific domains and societal endeavors.
  • We are globally renowned practitioners in translating innovation into practice. 


Translating Innovation into Practice

3 of 6

Intro of SDSC by Numbers


  • 250++ employees
  • ~3,000 training & event participants/year
  • ~10,000 active Unix accounts on HPC systems
  • 1M++ users on science gateways
  • 1M++ students took our Big Data courses
  • 4 HPC systems: ~200,000 x86 cores, ~1,500 GPUs
  • AI/ML supercomputing: Habana/Intel hardware
  • Universal Scale Storage: open for business from 200 TB to 10s of PB
  • Globally federated cyberinfrastructure: 100++ institutions on 5 continents

SDSC Expertise

We design, deploy, and operate end-to-end solutions for our partners from academia, government, industry & non-profits.

4 of 6

Introducing myself

Employment History

  • Director SDSC since December 2021
  • Tenured faculty at UCSD since 2003
    • Dual appointment in Physics & Data Science
  • MIT junior faculty 1999-2003
  • Caltech Millikan Fellow 1995-1999
  • Cornell PhD 1995

Experimental Particle Physicist analyzing data from the Large Hadron Collider. H-index 186. Former students and post-docs now faculty at a dozen institutions.

Physics Advisory Committee of LIGO

Software & Comp. Adv. Committee of IceCube

Executive Director of OSG

PI of NRP

Cyberinfrastructure Research

  • The Open Science Grid (642 citations)
  • The Pilot Way to Grid Resources using glideinWMS (306 citations)
  • Using XRootD to Federate Regional Storage (55 citations)
  • XRootD, disk based caching proxy for optimization of data access, data placement and data replication (29 citations)

Designed, built, and operated globally distributed compute and data federations since 2004. Used by the majority of global collaborations in Astronomy and Physics, including two that led to Nobel Prizes in Physics. Now used across all of science, including social science.

Adopted by NSF as basis for solicitations.

5 of 6

What Roles do RSEs have at SDSC?



RSEs are found among all 4, but mostly work in 2 & 3.

6 of 6

Why are RSEs needed in HPC?

  • Everything we do depends on the talent of our people & our ability to grow that talent.
    • See e.g.: NSF award led by Mary Thomas
    • make applications perform on systems that:
      • employ heterogeneous compute subsystems
      • are integrated with high performance storage systems
      • ingest and move PB of data over national R&E networks
    • design & implement application performance improvements exceeding “Moore’s law”
      • deep engagement with domain scientists to reduce time to insight much faster than Moore’s law
    • refactor complex domain science software systems to make them
      • sustainable and maintainable
      • modular, allowing plug & play of different algorithms that solve the same problem in conceptually different ways
      • automatable, scientifically reproducible, …
    • engineer paradigm shifts in domains by translating technical innovations into practice
