6th October 2023, Freiburg

EuroScienceGateway General Assembly meeting - Work Package 3

2022-12-31 by Name Surname

Grant agreement 101057388

Work package 3 - Pulsar Network: Distributed heterogeneous compute

European-wide Network

Easily available to Users

Easily deployable for Providers

Presentation title | Name Surname

The Pulsar network

Pulsar is a lightweight Python application that can be used to offload Galaxy jobs to a remote cluster. Pulsar can automatically import the input data required to run the job and export the results back to the originating Galaxy instance.

The Pulsar Network is a distributed job execution system that scales the computing resources available to Galaxy instances across heterogeneous compute facilities.
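
The import, run, and export cycle described above can be illustrated with a toy Python sketch. It is not Pulsar's API: `run_remote_job`, `demo`, and the file names are hypothetical, and a temporary directory stands in for the remote cluster.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_remote_job(inputs, command, output_names, galaxy_dir):
    """Toy model of a staged job: copy inputs in, run, copy outputs back.

    Hypothetical helper for illustration only; not part of Pulsar.
    """
    galaxy_dir = Path(galaxy_dir)
    collected = []
    with tempfile.TemporaryDirectory(prefix="remote_job_") as staging:
        staging = Path(staging)
        # Stage in: import the input data required to run the job.
        for path in inputs:
            shutil.copy(path, staging / Path(path).name)
        # Execute the job inside the (simulated) remote staging directory.
        subprocess.run(command, cwd=staging, check=True)
        # Stage out: export the results back to the originating instance.
        for name in output_names:
            target = galaxy_dir / name
            shutil.copy(staging / name, target)
            collected.append(target)
    return collected


def demo():
    """Run a trivial job that upper-cases a text file and return its output."""
    script = (
        "from pathlib import Path; "
        "Path('out.txt').write_text(Path('in.txt').read_text().upper())"
    )
    with tempfile.TemporaryDirectory(prefix="galaxy_") as galaxy:
        source = Path(galaxy) / "in.txt"
        source.write_text("hello pulsar")
        outputs = run_remote_job(
            [source], [sys.executable, "-c", script], ["out.txt"], galaxy
        )
        return outputs[0].read_text()
```

In this sketch the job never touches the originating directory; only the declared outputs are copied back once the command succeeds.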

The Open Infrastructure

We did not start from scratch…

  • A virtual machine image, named Virtual Galaxy Compute Nodes (VGCN), that provides everything needed to run Galaxy jobs.

  • Terraform scripts that take care of deploying the infrastructure on cloud resources.

  • Ansible scripts that complete the Pulsar configuration and provide an easy mechanism for updating it.

  • Continuous testing
  • Continuous Deployment

Work package 3 - Pulsar Network: Distributed heterogeneous compute

  • At least 10 Pulsar endpoints, routing the incoming jobs from Galaxy and other workflow management systems to local compute resources.
  • 6 national Galaxy instances that will make use of the Pulsar Network

European-wide Network

Easily available to Users

Easily deployable for Providers

Work Package 3 - Task 3.1

Task Lead: INFN

Task Members: ALU-FR, CESNET, CNR, IISAS

Goals:

  • Extend the Open Infrastructure for the Pulsar Network deployment.
  • Further extend it to AWS, Azure, and Google Cloud, and to a container orchestrator (Kubernetes).
  • Include EOSC-compliant AAI to facilitate integration with other services.

Status:

  • GitHub: https://github.com/usegalaxy-eu/pulsar-deployment
  • Ansible roles, Terraform recipes, and documentation are already available.

Develop and maintain an Open Infrastructure based deployment model for Pulsar endpoints (M1-M36)

Work Package 3 - Task 3.1

Open Infrastructure and VGCN image update.

Work Package 3 - Task 3.2

Task Lead: CESNET

Task Members: ALU-FR, CNR

Goals:

  • Implement support for the GA4GH Task Execution Service (TES), allowing other services to submit jobs via TES to Pulsar and to the European Pulsar Network.
  • Two main drivers: 1) add support for the TES and WES standards to Galaxy; 2) provide access to Pulsar resources (via TES) for different WES services.

Status:

  • TES spec: Task Execution Service (1.0.0)

Add the GA4GH Task-Execution-Service (TES) API to Pulsar (M1-M12)

Work Package 3 - Task 3.2

TESP (TES for Pulsar) is a separate microservice, decoupled from Pulsar:

  • implementing the TES standard
  • distributing TES tasks to Pulsar applications
  • currently using the Pulsar REST API
  • tasks started in Docker containers (+ Docker Compose usage)

The current version (https://github.com/CESNET/tesp-api) provides:

  • TESP implementation, usable via curl or Snakemake
  • three methods for file transfer (S3, HTTPS, FTP)
  • support for the development version of the Galaxy TES runner
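
As a rough illustration of what a client would send to a TES service such as TESP, the sketch below builds a task message in Python. The field names follow the GA4GH TES 1.0 task schema; the function name, URLs, container image, and command are hypothetical placeholders. No request is sent here, and the endpoint path to post to depends on the deployment.

```python
import json


def build_tes_task(name, image, command, input_url, input_path,
                   output_url, output_path):
    """Return a dict shaped like a GA4GH TES 1.0 task message.

    Hypothetical helper for illustration; not part of tesp-api or Pulsar.
    """
    return {
        "name": name,
        # Files the service stages into the container before execution.
        "inputs": [{"url": input_url, "path": input_path, "type": "FILE"}],
        # Files the service uploads after the executors have finished.
        "outputs": [{"url": output_url, "path": output_path, "type": "FILE"}],
        # One executor: a container image and the command to run in it.
        "executors": [{"image": image, "command": command}],
        "resources": {"cpu_cores": 1, "ram_gb": 1.0},
    }


# Example values are placeholders, not real storage locations.
task = build_tes_task(
    name="md5sum-example",
    image="ubuntu:22.04",
    command=["md5sum", "/data/input.txt"],
    input_url="s3://example-bucket/input.txt",
    input_path="/data/input.txt",
    output_url="s3://example-bucket/output.txt",
    output_path="/data/output.txt",
)
payload = json.dumps(task, indent=2)
```

The resulting `payload` string is what a tool like curl would carry in the body of a task-creation request.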

Work Package 3 - Task 3.2

Considerable effort went into testing compatibility with the Galaxy TES runner, which is still under development (https://github.com/galaxyproject/galaxy/pull/14462).

Current issues:

  • Because the runner is still under development, this approach currently has limitations: a workflow can be started from Galaxy on a Pulsar node, but its output is not transferred back to Galaxy.
  • The main limitation of the current Galaxy runner is the missing support for creating and specifying a Docker image, which should contain all tools required by the workload.

Work Package 3 - Task 3.3

Task Lead: CESNET

Task Members: ALU-FR, VIB, EPFL, CESNET, BSC, CNRS, CNR, INFN, UiO, AGH / AGH-UST, IISAS, TUBITAK

Goals:

  • Deploy and maintain Pulsar endpoints.

Status:

Build a European-wide network of Pulsar sites (M7-M36)

Work Package 3 - Task 3.3

Many endpoints have already been updated:

  • DE* (ALU-FR) - Germany
  • IT02 and IT03 (CNR) - Italy
  • SK01 (IISAS) - Slovakia
  • FR01 (CNRS - GenOuest) - France
  • CZ01 (CESNET) - Czech Republic
  • EGI01 (EGI and INFN) - Italy
  • TUBITAK ULAKBIM setup is almost completed

… and counting.

Work Package 3 - Task 3.4

Task Lead: BSC

Task Members: UNIMAN

Goals:

  • Extend WfExS to support ESG as a compute platform.
  • Execute tasks on the Pulsar Network using the TES API developed in T3.2.

Status:

Add TES support to WfExS (Workflow Execution Service) (M18-M36)

Work Package 3 - Task 3.5

Task Lead: VIB

Task Members: ALU-FR, UiO, UB, CNRS, CNR

Goals:

  • Develop and maintain an Open Infrastructure for deploying National Galaxy instances.
  • Deploy National Galaxy instances to access local infrastructure and the Pulsar Network.
  • User support

Status:

  • GitHub: https://github.com/usegalaxy-eu
  • Ansible roles and Terraform recipes are available. Some useGalaxy national instances (Belgium, France) are already up and running.

Developing and maintaining national or domain-driven Galaxy servers (M1-M36)

Work Package 3 - Task 3.5

The Open Infrastructure deploys full-fledged usegalaxy.eu replica servers. This makes it easy to instantiate new usegalaxy services and also provides a robust framework for maintaining and updating running instances.

Started draft documentation (temporary repository):

  • https://usegalaxy-it.github.io/documentation/

Currently we have 7 endpoints:

  • EU (ALU-FR), Be (VIB), Fr (CNRS), Es (BSC-CNS), No (ELIXIR-NO), Cz (CESNET), It (CNR; deployment is ongoing).

Work Package 3 - Task 3.5

EU (ALU-FR):

  • Leads the development of the Open Infrastructure framework.
  • Upgraded from Galaxy 23.0 to Galaxy 23.1.
  • Switched from Sorting Hat to the Total Perspective Vortex (TPV) meta-scheduler (https://galaxyproject.org/news/2023-05-08-tpv-switch/).
  • Onboarded multiple Pulsar endpoints from diverse partners.
  • Added a ‘remote resources’ dropdown menu to the user preferences that lets users select a specific Pulsar endpoint, and integrated it with TPV so that jobs are scheduled and distributed to the selected endpoint.
  • Enabled deployment of national and domain-driven Galaxy instances by knowledge transfer and support.
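
The idea behind the ‘remote resources’ preference can be sketched in a few lines of Python. This is a conceptual illustration only: it does not use TPV's configuration syntax or API, `choose_destination` and `local_cluster` are hypothetical names, and the endpoint identifiers are borrowed from the Task 3.3 list.

```python
# Endpoint identifiers taken from the Task 3.3 slide; the default
# destination name is a made-up placeholder.
ENDPOINTS = frozenset({"IT02", "IT03", "SK01", "FR01", "CZ01", "EGI01"})
DEFAULT_DESTINATION = "local_cluster"


def choose_destination(user_preference, available=ENDPOINTS,
                       default=DEFAULT_DESTINATION):
    """Pick the endpoint a job should be sent to.

    Honour the user's selection when it names a known endpoint,
    otherwise fall back to the default destination.
    """
    if user_preference in available:
        return user_preference
    return default
```

For example, `choose_destination("CZ01")` keeps the user's choice, while an unset or unknown preference falls back to the default destination.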

Work Package 3 - Task 3.5

IT (CNR):

  • A test instance has been deployed with the OI (Galaxy 23.0), together with a Pulsar endpoint.
  • The infrastructure automation framework and documentation are hosted on GitHub:

New hardware resources have been acquired in the context of the PON project CNR.BiOmics, while more will be acquired in the context of the ELIXIRxNextGenIT RRF project. Some of those resources will be dedicated to the UseGalaxy.it server, whose main production instance will be deployed at the ReCaS-Bari data center.

Work Package 3 - Task 3.5

Be (VIB):

  • The Galaxy version was upgraded from 21.01 to 23.0.
  • The PostgreSQL version was upgraded from 9 to 15.
  • The OS was upgraded from CentOS 7 to Rocky Linux 8.
  • Started migration to Total Perspective Vortex.
  • Configured RabbitMQ.
  • Added a Pulsar endpoint (still requires some more testing).

Work Package 3 - Task 3.5

Fr (CNRS):

  • Upgraded to version 23.0
  • Upgraded to latest TIaaS version
  • Started migration to Total Perspective Vortex
  • Installed and configured tools produced by the Biodiversity use case of WP5

Work Package 3 - Task 3.5

Cz (CESNET):

CESNET deployed a production version on https://usegalaxy.cz, as a collaboration of e-INFRA CZ and ELIXIR CZ.

  • Version 23.0, with TPV support, aiming to deploy the majority of tools installed on usegalaxy.eu.
  • Installation with Pulsar, connecting to the e-infrastructure's PBSPro, which provides dedicated compute nodes (including GPU nodes) to Galaxy.
  • Several Pulsar nodes are currently installed to support usegalaxy.cz, usegalaxy.eu, and another installation specific to ELIXIR services; we are looking for better support within a single installation.
  • Specific AAI setup: usegalaxy.cz provides both Lifescience AAI (to support ELIXIR users) and e-INFRA CZ AAI (for users of the national e-infrastructure).

Work Package 3 - Task 3.5

Es (BSC-CNS):

The instance is currently implemented on OpenStack, using the cloud resources available at BSC.

  • Setup of a PostgreSQL database.
  • CVMFS for reference data.
  • Over 1000 different pre-installed tools.
  • Slurm as workload manager.
  • Working to expand its capacity and availability, through additional Slurm.
  • The deployment of a Pulsar endpoint is already scheduled

Conclusions and next steps

Short-term goals:

  • Finish the documentation for both Pulsar and (Use)Galaxy endpoint deployment.
  • Harden the Pulsar endpoints already available and deploy new ones.

Long-term objectives:

  • Other workflow management systems will be enabled to submit jobs to this distributed compute network.

Thank you for your attention!

The Work Package 3 team:

Marco Tangaro (CNR)

Federico Zambelli (CNR and UniMi)

Bjoern Gruening (ALU-FR)

Mira Kuntz (ALU-FR)

Sanjay Kumar Srikakulam (ALU-FR)

Stefano Nicotri (INFN)

María Chavero-Díez (BSC-CNS)

Josep Ll. Gelpi (UB)

Anthony Bretaudeau (CNRS)

Eva Mercier (CNRS)

Hakan Bayindir (TUBITAK ULAKBIM)

Jan Astalos (IISAS)

Viet Tran (IISAS)

Sebastian Luna-valero (EGI)

Lukasz Opiola (AGH-UST)

Olivier Collin (CNRS)

Miroslav Ruda (CESNET)

Josef Handl (CESNET)

BACKUP

Objectives - Task 3.4

WfExS is a high-level workflow execution service backend, developed within EOSC-Life as part of Demonstrator 7 (D7), which can manage workflows across different domains.

It has a strong focus on reproducible and replicable analysis by using digital objects like RO-Crate.

  • Fetches workflows from WorkflowHub.
  • Identifies the workflow type and runs it using its native workflow execution engine (currently CWL and Nextflow).
  • Identifies the containers needed by the workflow and fetches them.
  • Optionally describes the results with a RO-Crate and makes them available to users.
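
The "identify the workflow type" step in the list above can be pictured with a deliberately simplified Python sketch. It is not WfExS code: the function name and the suffix mapping are hypothetical, and a real service would presumably inspect the workflow content rather than rely on the file name alone.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file suffix to workflow type, covering the
# two types named on the slide.
SUFFIX_TO_TYPE = {".cwl": "CWL", ".nf": "Nextflow"}


def identify_workflow_type(filename):
    """Guess the workflow type from the file suffix (case-insensitive)."""
    suffix = PurePosixPath(filename).suffix.lower()
    try:
        return SUFFIX_TO_TYPE[suffix]
    except KeyError:
        raise ValueError(f"unsupported workflow file: {filename}") from None
```

The returned label would then decide which native execution engine is invoked for the workflow.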

How are we planning to achieve the objectives?
