1 of 11

Status of the in-development ARC job runner for Galaxy

Send Galaxy jobs to remote sites with NorduGrid ARC middleware

2 of 11

Key ARC components

  • NorduGrid ARC is widely used today in the Worldwide LHC Computing Grid (WLCG) as one of the two recommended middlewares connecting the grid sites.

  • ARC-CE: a Compute Element that provides interfaces to computing resources, enabling distributed computing
    • Modular, consists of several sub-components (services and utilities)
    • Interface for job control
    • Interface for exposing resource and job status info
  • Optimised for HPC deployment
    • User traceability
    • Built-in data staging and caching on the ARC-CE – no middleware on the compute node -> no inbound connectivity needed
      • In contrast with the standard Pilot mode (e.g. used by DIRAC) – but ARC can also run in pilot mode if needed

  • Resource usage Accounting
    • Local accounting database
    • Integrated with community accounting servers APEL and SGAS – ARC sends records
  • API: C++ and Python, for interfacing to the full software stack
    • Enables custom framework integration
    • CLI built upon the API – job management, information, etc. (arcget, arcsub, arc*)
  • Security Layer:
    • Token support
    • Originally built around X509
    • Very powerful and flexible authorization schemes on the server side (identity mapping and authorization control)
  • Soon to be officially released: new ARC Python REST client

European Galaxy Days - ESG annual meeting - Maiken Pedersen

1

October 2023

3 of 11

ARC overview

[Figure: ARC architecture diagram. Labels: Remote HPC center/computing site; Frontend/login server; ARC-CE; A-REX; REST/HTTPS; Security layer – token and x509; Authentication services; Data transfer system; Infoprovider scripts; Batch system backends; Compute nodes]

4 of 11

ARC – new “pulsar” flavour

  • Added benefits of including ARC in the existing Galaxy-Pulsar network:
    • ARC handles data staging from remote sources
    • ARC is designed to meet the requirements of HPC systems
  • ARC can be installed in addition to, or instead of, a Galaxy Pulsar node for remote submission of Galaxy jobs
  • For ARC: a chance to open up to communities outside high energy physics

5 of 11

ARC Galaxy job runner

  • Using the Galaxy dev branch
    • Which includes the token renewal mechanism
  • Jobs are sent from Galaxy to a remote ARC endpoint, always with a fresh token from Galaxy.
    • Jobs are accepted or rejected according to the validity of the token and the configuration of ARC
  • Added a WLCG IAM IdP to social-core and configured Galaxy to use it
  • Input files uploaded to the job in Galaxy (local) are sent to the ARC endpoint
    • Path rewrites are implemented, so far only for input files:

/bin/bash './runhello.sh' "Hi from galaxy job wrapper" --test >> command_out.txt; /bin/echo "Some post processing job" >> command_out.txt

  • Input files specified to be collected from remote storage are handled by ARC at the remote compute site
  • Depends on the newly written ARC Python REST client, pyarcrest
    • Not yet officially released
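
The "always fresh token" requirement above can be illustrated with a small check on a token's remaining lifetime before a job is sent. This is only a sketch and not the renewal mechanism in the Galaxy dev branch: the function names and the 300-second margin are hypothetical, and the code assumes the token is a JWT carrying a standard `exp` claim. It reads the claim without verifying the signature, since accepting or rejecting the token remains the job of the ARC endpoint.

```python
import base64
import json
import time
from typing import Optional


def token_expiry(token: str) -> int:
    """Return the `exp` claim (seconds since the epoch) of a JWT, unverified."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def needs_refresh(token: str, min_lifetime: int = 300,
                  now: Optional[float] = None) -> bool:
    """True if the token expires in less than `min_lifetime` seconds.

    `min_lifetime` is a hypothetical safety margin, not an ARC or Galaxy setting.
    """
    if now is None:
        now = time.time()
    return token_expiry(token) - now < min_lifetime
```

A runner could call `needs_refresh(token)` just before submission and ask Galaxy for a renewed token when it returns True.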

6 of 11

[Figure: listings of the ARC workdir on the remote site and the Galaxy workdir on the Galaxy server]

7 of 11

Next steps

  • Make the ARC job runner work for "any" tool from the toolbox
    • Or identify which kinds of tools will not work
  • Remote data staging should work nicely from Galaxy, as this is one of the key strengths of ARC
  • Handle filtering of output data once all output data has been fetched from ARC to Galaxy
    • How to do this is being discussed with the Galaxy backend devs (Marius, John ...)
    • Currently all files in the ARC job's workdir are collected back to Galaxy, that is, both input and output files (input files could easily be filtered out, but this is not yet done)
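
The input-file filtering mentioned in the last bullet could look roughly like the sketch below. It is not part of the runner: `select_outputs` and its arguments are hypothetical names, and it assumes both lists hold paths relative to the ARC job's workdir.

```python
from pathlib import PurePosixPath


def select_outputs(workdir_files, input_files):
    """Return the fetched workdir files that were not sent as job inputs.

    Paths are compared after normalisation, so './runhello.sh' and
    'runhello.sh' count as the same file.
    """
    inputs = {PurePosixPath(p) for p in input_files}
    return [p for p in workdir_files if PurePosixPath(p) not in inputs]


# Hypothetical workdir content after a job like the hello-world example.
fetched = ["runhello.sh", "command_out.txt"]
outputs = select_outputs(fetched, ["./runhello.sh"])
```

This only removes known inputs. Deciding which of the remaining files are real tool outputs is the part still under discussion.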

8 of 11

Links

9 of 11

Discussion points for developers for further work

10 of 11

To-dos – discussion points

  • Agree on a convenient way to specify a list of remote input files
    • ARC needs a list of URLs for remote files
    • Prototype solution: a file containing a list of URLs, uploaded with the job
  • In Galaxy, users should be able to select which IdP to use to generate the token needed for a particular ARC job; this is not necessarily the one used to log into Galaxy
    • Is this something that will come?
  • In Galaxy, users should be able to select a remote endpoint via a drop-down menu, and to add, remove and edit the remote endpoints they have access to
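
As an illustration of the prototype idea of a file listing remote URLs, the sketch below parses such a file into a list that could then be handed to ARC. The file format is an assumption (one URL per line, blank lines and `#` comments ignored), and `parse_url_list` is a hypothetical helper, not an existing Galaxy or pyarcrest function.

```python
from urllib.parse import urlparse


def parse_url_list(text):
    """Parse an uploaded URL-list file: one URL per line.

    Blank lines and lines starting with '#' are skipped (assumed format).
    Lines without both a scheme and a host are rejected early, so a bad
    entry fails in Galaxy rather than at the remote site.
    """
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parsed = urlparse(line)
        if not parsed.scheme or not parsed.netloc:
            raise ValueError(f"not a usable remote URL: {line!r}")
        urls.append(line)
    return urls


# Hypothetical list file content.
example = "# reference data\nhttps://example.org/data/ref.fa\n\nhttps://example.org/data/reads.fq\n"
remote_inputs = parse_url_list(example)
```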

11 of 11

To-dos – discussion points – path rewrites

  • Jobs run on the remote ARC system, with no shared Galaxy filesystem.
    • Paths must be rewritten, e.g. /galaxy/data/datasets/4/5/c/dataset_blah.dat -> ./real_filename.txt
    • Straightforward if the command only contains paths to input or output datasets
    • But how should complex command lines containing reference datasets be handled?

ln -f -s '/storage/galaxy/data/datasets/4/5/c/dataset_45cf1bcb-28b1-4167-878d-1fb17636064e.dat' reference.fa && minimap2 --q-occ-frac 0.01 -t ${GALAXY_SLOTS:-4} reference.fa '/storage/galaxy/data/datasets/4/5/c/dataset_45cf1bcb-28b1-4167-878d-1fb17636064e.dat' -a | samtools view --no-PG -hT reference.fa | samtools sort -@${GALAXY_SLOTS:-2} -T "${TMPDIR:-.}" -O BAM -o '/storage/galaxy/data/jobs/000/228/outputs/dataset_03e29ceb-8733-4836-8981-90391c58105f.dat'

  • From the runner, these reference datasets need to be identified in the same way as input and output datasets can be identified (job_wrapper.get_job().get_input_datasets(), job_wrapper.get_job().get_output_datasets())
  • Pulsar already handles this. What would be the lightest way of adding this functionality to the ARC job runner?
  • How is DIRAC handling this?
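
The rewrite step itself can be sketched as a plain string substitution driven by a map from Galaxy-side paths to names in the remote workdir. This is not the runner's actual code: `rewrite_paths` is a hypothetical helper, and the output path and remote names in the example are made up. The sketch assumes the map is already complete. Building that map for reference datasets is exactly the open question above.

```python
def rewrite_paths(command, path_map):
    """Replace every Galaxy-side path in `command` with its remote name.

    Longer paths are substituted first, so a path that is a prefix of
    another one cannot clobber it.
    """
    for src in sorted(path_map, key=len, reverse=True):
        command = command.replace(src, path_map[src])
    return command


# Hypothetical mapping; only the first source path appears in the slides.
path_map = {
    "/galaxy/data/datasets/4/5/c/dataset_blah.dat": "./real_filename.txt",
    "/galaxy/data/jobs/000/1/outputs/dataset_out.dat": "./output.txt",
}
command = (
    "cat '/galaxy/data/datasets/4/5/c/dataset_blah.dat' "
    "> '/galaxy/data/jobs/000/1/outputs/dataset_out.dat'"
)
rewritten = rewrite_paths(command, path_map)
```

Because the substitution works on the whole command string, a dataset path is rewritten wherever it occurs, including inside an `ln -s` prefix, as long as it is in the map.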
