1 of 19

Proposing a Scientific Software Distribution Service

Steffen Bollmann, Peter Marendy, Andy Botting, Aswin Narayanan, Audrey Stott, Sarah Beecroft, Greg Darcy, Jonathan Smillie, Lisa Phippard, Nigel Ward, Ryan Fraser, Hoylen Sue

 

2 of 19

Acknowledgement of Country

The University of Queensland (UQ) acknowledges the Traditional Owners and their custodianship of the lands on which we meet.

We pay our respects to their Ancestors and their descendants, who continue cultural and spiritual connections to Country.

We recognise their valuable contributions to Australian and global society.

Steffen Bollmann | @sbollmann_MRI | http://www.github.com/ssdsorg

3 of 19

Goals of today's tech talk

Get input on how to do things better

Show what we want to build

Motivate people to join the project

4 of 19

What is the problem?

How can we efficiently deliver software to infrastructure providers?

Security concerns?

Does the software work, and how can we test the functionality automatically?

Discoverability - finding the right software?

Citations of tools/containers?

5 of 19

Scope of the project

Start with neuroimaging and BioCommons containers plus reference data

Automatically build and verify

Distribute across Australia

Secure Building

Distributing

6 of 19

History

  • This proposal aims to bring together several recent developments in the community

Australian BioCommons Bring Your Own Data (BYOD) Expansion Project

    • Aimed at providing shared reference datasets and containers across different infrastructures

AEDAPT NeuroDesk platform project

    • Containers automatically built via GitHub Actions and distributed via CVMFS

European Environment for Scientific Software Installations (EESSI)

    • Software layer built with EasyBuild, Lmod and archspec; compatibility layer based on Gentoo Prefix; distribution via CVMFS

7 of 19

Proposed Architecture and Work packages

Container Recipe Generator

Automated Container builder

Functional Verification

Container Documentation

Container security scanning

Container DOIs

Container registry

Container Modules

Container distribution

8 of 19

Container Recipe Generator / Template / Renderer

  • Writing good Docker recipes is tedious
  • Domain-specific recipe generators such as neurodocker are a great help
  • Other, domain-agnostic tools:
    • HPC Container Maker
    • Spack
    • EasyBuild
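
As a rough illustration of what a recipe generator does, the sketch below renders Dockerfile text from a small structured description. It is not the API of neurodocker or any of the tools listed above; the `Recipe` class, its field names and the example tool are hypothetical and only show the idea of keeping the recipe as data and generating the boilerplate.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    """Hypothetical structured description of a container recipe."""
    name: str
    version: str
    base_image: str
    apt_packages: list[str] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)


def render_dockerfile(recipe: Recipe) -> str:
    """Render the recipe into Dockerfile text."""
    lines = [
        f"FROM {recipe.base_image}",
        f'LABEL org.opencontainers.image.title="{recipe.name}"',
        f'LABEL org.opencontainers.image.version="{recipe.version}"',
    ]
    if recipe.apt_packages:
        # Sort package names so the rendered recipe is reproducible.
        pkgs = " ".join(sorted(recipe.apt_packages))
        lines.append(
            "RUN apt-get update && "
            f"apt-get install -y --no-install-recommends {pkgs} && "
            "rm -rf /var/lib/apt/lists/*"
        )
    for key, value in sorted(recipe.env.items()):
        lines.append(f'ENV {key}="{value}"')
    return "\n".join(lines) + "\n"


# Illustrative input only: "exampletool" is not a real package.
example = Recipe(
    name="exampletool",
    version="1.0.0",
    base_image="ubuntu:22.04",
    apt_packages=["curl", "ca-certificates"],
    env={"LC_ALL": "C.UTF-8"},
)
dockerfile_text = render_dockerfile(example)
```

Keeping the recipe as data means the same description could later feed documentation or module generation, not just the Dockerfile.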

9 of 19

Automated Container build

  • GitHub repository that collects Docker recipes from contributors
  • Recipes are built via GitHub Actions on custom runners in the Nectar cloud, orchestrated via cirun.io; architecture-specific builds use dedicated custom runners (e.g. ARM, AMD Milan …)
  • Periodic rebuilds for new versions?

10 of 19

Functional Verification

  • Test scripts with test data that we can run on Nectar, integrated into our build process?
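
One possible shape for such a test step is sketched below: run a command, then require a zero exit code and a known checksum of its output. The `FunctionalTest` class and `run_test` helper are hypothetical. In a real build the command would presumably be something like `singularity exec <image> <tool> <test data>`; here the Python interpreter stands in so the sketch runs anywhere.

```python
import hashlib
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class FunctionalTest:
    """Hypothetical test definition: a command plus the expected stdout hash."""
    name: str
    command: list[str]
    expected_sha256: str


def run_test(test: FunctionalTest, timeout_s: int = 60) -> bool:
    """Return True if the command exits with 0 and stdout matches the hash."""
    result = subprocess.run(
        test.command, capture_output=True, timeout=timeout_s, check=False
    )
    if result.returncode != 0:
        return False
    return hashlib.sha256(result.stdout).hexdigest() == test.expected_sha256


# Stand-in commands; a real test would call the tool inside the container.
expected = hashlib.sha256(b"ok").hexdigest()
passing = FunctionalTest(
    name="prints-ok",
    command=[sys.executable, "-c", "import sys; sys.stdout.write('ok')"],
    expected_sha256=expected,
)
failing = FunctionalTest(
    name="wrong-output",
    command=[sys.executable, "-c", "import sys; sys.stdout.write('changed')"],
    expected_sha256=expected,
)
results = {t.name: run_test(t) for t in (passing, failing)}
```

Comparing checksums only works for tools with deterministic output; tools with floating-point or timestamped output would need a tolerance-based comparison instead.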

11 of 19

Container Documentation

  • EDAM / bio.tools?
  • Voting for new containers and discovering what is already there -> GitHub issues?
  • How do we find volunteers to build and test the requested containers?
  • How do we handle different hardware architectures and MPI versions?

12 of 19

Container Security Scanning

  • Scan after build using the Azure container-scan GitHub Action
  • Scan during storage in the registry (e.g. Harbor)
  • Regular scans on the distribution infrastructure (ideas?)

13 of 19

Making Containers citable

  • Upload Singularity files to Zenodo and create a DOI?
  • Or can we mint our own DOIs and link them to Nectar object storage?

14 of 19

Container Registry

15 of 19

Container Modules

  • Integration of containers into the module system on HPC systems via Lmod
  • SHPC?
    • Can SHPC automatically expose individual binaries inside a container? Currently, does every binary need a separate entry in the YAML?
  • How to handle different sites?
    • Does every site get a subdirectory on the distribution system with its own container subset and SHPC config?
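
To make the Lmod integration more concrete, here is a minimal sketch that generates a Lua modulefile putting a directory of container wrappers on `PATH`. The modulefile functions `help`, `whatis` and `prepend_path` are standard Lmod calls; the `render_modulefile` helper, the tool name and the directory layout are assumptions for illustration, and this is not how SHPC itself generates modules.

```python
def render_modulefile(tool: str, version: str, wrapper_dir: str) -> str:
    """Return Lua modulefile text that exposes wrapper_dir on PATH."""
    return (
        f"help([[{tool} {version}, provided from a container]])\n"
        f'whatis("Name: {tool}")\n'
        f'whatis("Version: {version}")\n'
        f'prepend_path("PATH", "{wrapper_dir}")\n'
    )


# Hypothetical tool and path; a site-specific prefix could be swapped in here.
modulefile_text = render_modulefile(
    tool="exampletool",
    version="1.0.0",
    wrapper_dir="/containers/exampletool_1.0.0/bin",
)
```

With one such file per tool and version, a per-site subdirectory would only need its own module tree pointing at the subset of containers that site offers.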

16 of 19

Transparent Singularity – developed for Neurodesk

  • What users want:

  • What users would need to run:

  • Neurodesk: Automatically generate wrapper scripts for every application inside the container

https://github.com/NeuroDesk/transparent-singularity
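
The wrapper-script idea can be sketched as follows. This is not the transparent-singularity implementation, just a minimal illustration under stated assumptions: the list of executables is already known, and each wrapper forwards its arguments to `singularity exec`. The helper names, image path and tool names are hypothetical.

```python
import shlex
import stat
import tempfile
from pathlib import Path


def wrapper_script(image_path: str, executable: str) -> str:
    """Return a bash wrapper that forwards all arguments into the container."""
    return (
        "#!/usr/bin/env bash\n"
        f"exec singularity exec {shlex.quote(image_path)} "
        f'{shlex.quote(executable)} "$@"\n'
    )


def write_wrappers(
    image_path: str, executables: list[str], out_dir: Path
) -> list[Path]:
    """Write one executable wrapper file per tool name into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in executables:
        path = out_dir / name
        path.write_text(wrapper_script(image_path, name))
        # Add execute permission for user, group and others.
        mode = path.stat().st_mode
        path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        written.append(path)
    return written


# Hypothetical image path and tool names, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    wrappers = write_wrappers(
        "/containers/exampletool_1.0.0.sif",
        ["exampletool", "exampletool-convert"],
        Path(tmp),
    )
    wrapper_names = [p.name for p in wrappers]
    first_wrapper_text = wrappers[0].read_text()
```

Once the wrapper directory is on `PATH`, users call `exampletool` as if it were installed natively, which is the "what users want" case on this slide.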

Steffen Bollmann | @neuro_desk | http://neurodesk.github.io/

17 of 19

Transparent Singularity – developed for Neurodesk

  • Using the Lmod module system, we can now combine tools from different Singularity containers in a larger workflow ☺

18 of 19

Distributing Singularity containers via CVMFS

  • Download and unpack Singularity containers to CVMFS storage for distribution and on-demand access; also distribute reference data
  • Every user site has its own subset of containers that it offers to its users (e.g. architecture-optimized)

[Figure: CVMFS distribution diagram. Labels: Stratum 0: Canberra; Stratum 1: Adelaide, Brisbane, Perth; GeoIP; local Squid proxy; clients: HPC, desktop, laptop]

19 of 19

Discussion

  • Ideas and thoughts?
  • We need a good name:
    • Easy to find on Google and still free as a GitHub organization name
