Total Perspective Vortex (TPV): Seamlessly managing job destinations in Galaxy when users can’t be trusted with resources
Sanjay K. Srikakulam
Albert-Ludwigs-Universität Freiburg
So, what exactly is Galaxy and what can it offer?
Distributing analysis across computing resources
https://pulsar-network.readthedocs.io
Galaxy ecosystem
TIaaS (Training Infrastructure as a Service)
BYOS (Bring Your Own Storage)
BYOC (Bring Your Own Compute)
UseGalaxy.EU
Total Perspective Vortex (TPV) - Motivation
Total Perspective Vortex
TPV – Configuring resource requirements for a tool
tools:
  bowtie2.*:
    cores: 6
    mem: cores * 4
    gpus: 0
    env: []
destinations:
  slurm:
    cores: 24
    mem: 64
    gpus: 1
  arc01:
    cores: 8
    mem: 32
    gpus: 0
TPV will route to a destination where the job will fit
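The fit check can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TPV's implementation: `resolve` and `fits` are hypothetical helpers, and the `mem` expression is evaluated with plain `eval` for brevity.

```python
# Hypothetical sketch of the "does the job fit?" check; not TPV's actual code.
tool = {"cores": 6, "mem": "cores * 4", "gpus": 0}
destinations = {
    "slurm": {"cores": 24, "mem": 64, "gpus": 1},
    "arc01": {"cores": 8, "mem": 32, "gpus": 0},
}


def resolve(tool):
    """Turn expression strings such as 'cores * 4' into numbers."""
    mem = tool["mem"]
    if isinstance(mem, str):
        # Evaluate the expression with only `cores` in scope (illustration only).
        mem = eval(mem, {"__builtins__": {}}, {"cores": tool["cores"]})
    return {"cores": tool["cores"], "mem": mem, "gpus": tool["gpus"]}


def fits(request, capacity):
    """A job fits when every requested resource is within the destination's limit."""
    return all(request[key] <= capacity[key] for key in ("cores", "mem", "gpus"))


request = resolve(tool)
candidates = [name for name, cap in destinations.items() if fits(request, cap)]
print(request)
print(candidates)
```

With the numbers from the slide, `mem` resolves to 24, so both `slurm` and `arc01` remain candidates.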
TPV – Intelligent resource selection
arc01 would have been a better fit because of the highmem tag, but it is currently marked offline, so slurm it is.
tools:
  bowtie2.*:
    cores: 12
    mem: cores * 4
    gpus: 0
    env: []
    scheduling:
      require: []
      prefer:
        - highmem
      accept:
      reject:
        - offline
    rules: []
destinations:
  slurm:
    cores: 16
    mem: 64
    gpus: 2
    scheduling:
      prefer:
        - general
  arc01:
    cores: 16
    mem: 64
    gpus: 0
    scheduling:
      prefer:
        - highmem
      reject:
        - offline
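The selection described above can be modelled roughly as follows. This is a deliberately simplified, hypothetical model: it assumes a destination "carries" every tag named anywhere in its `scheduling` block, which is how the slide describes arc01 as marked offline. TPV's real tag matching is richer than this, and `carried` and `score` are illustrative names only.

```python
# Simplified, hypothetical model of tag-based selection; not TPV's actual code.
tool_tags = {"require": [], "prefer": ["highmem"], "accept": [], "reject": ["offline"]}
destination_tags = {
    "slurm": {"prefer": ["general"]},
    "arc01": {"prefer": ["highmem"], "reject": ["offline"]},
}


def carried(tags):
    """Assumption: a destination carries every tag named in its scheduling block."""
    return {tag for values in tags.values() for tag in values}


def score(tool, dest):
    """Return None if the destination is ruled out, else a preference score."""
    present = carried(dest)
    if any(tag in present for tag in tool["reject"]):
        return None  # destination carries a tag the tool rejects
    if any(tag not in present for tag in tool["require"]):
        return None  # destination lacks a tag the tool requires
    return sum(tag in present for tag in tool["prefer"])


scores = {name: score(tool_tags, tags) for name, tags in destination_tags.items()}
eligible = {name: s for name, s in scores.items() if s is not None}
best = max(eligible, key=eligible.get)
print(scores, best)
```

Under this model arc01 is ruled out by the `offline` tag despite matching `highmem`, which leaves slurm as the only eligible destination.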
TPV – Routing rules and inheritance
tools:
  bowtie2:
    cores: 4
    mem: 16
    rules:
      - if: input_size > 10
        cores: 16
        mem: 32
      - if: input_size >= 20 and input_size <= 50
        scheduling:
          require:
            - highmem
      - if: input_size >= 55
        fail: Size (input_size) is too large
Embedded Python expressions
Conditional tags
Contextualized errors
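A rough Python sketch of how such rules could be evaluated is shown here. It is an illustration, not TPV's implementation: `apply_rules` and `JobFailed` are hypothetical names, scheduling tags are flattened into a single `require` list, and interpolating the value into the failure message with `{input_size}` is an assumption about how contextualized errors are produced.

```python
# Hypothetical sketch of rule evaluation; not TPV's actual code.
base = {"cores": 4, "mem": 16, "require": []}
rules = [
    {"if": "input_size > 10", "cores": 16, "mem": 32},
    {"if": "input_size >= 20 and input_size <= 50", "require": ["highmem"]},
    # Assumption: the message is a template filled in from the job context.
    {"if": "input_size >= 55", "fail": "Size ({input_size}) is too large"},
]


class JobFailed(Exception):
    """Raised when a matching rule carries a `fail` message."""


def apply_rules(base, rules, input_size):
    """Apply every matching rule in order; later rules override earlier values."""
    result = dict(base)
    context = {"input_size": input_size}
    for rule in rules:
        # Each `if` is a Python expression evaluated against the job context.
        if not eval(rule["if"], {"__builtins__": {}}, context):
            continue
        if "fail" in rule:
            raise JobFailed(rule["fail"].format(**context))
        result.update({k: v for k, v in rule.items() if k != "if"})
    return result


print(apply_rules(base, rules, 5))
print(apply_rules(base, rules, 30))
```

An input size of 5 keeps the defaults, 30 raises the resources and adds the `highmem` requirement, and 60 raises `JobFailed` with the size included in the message.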
tools:
  bwa_mem.*:
    inherits: bowtie2
    cores: 16
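The effect of `inherits` can be illustrated with a small dictionary merge. This is a sketch under assumptions, not TPV's implementation: `resolve_tool` is a hypothetical helper, the rules are represented by a placeholder string, and the sketch simply lets the child's own keys override the parent's.

```python
# Hypothetical sketch of `inherits`; not TPV's actual code.
tools = {
    "bowtie2": {"cores": 4, "mem": 16, "rules": ["<rules from bowtie2>"]},
    "bwa_mem.*": {"inherits": "bowtie2", "cores": 16},
}


def resolve_tool(name, tools):
    """Merge a tool with its parent chain; the child's own keys win."""
    entry = tools[name]
    parent_name = entry.get("inherits")
    merged = resolve_tool(parent_name, tools) if parent_name else {}
    merged.update({k: v for k, v in entry.items() if k != "inherits"})
    return merged


resolved = resolve_tool("bwa_mem.*", tools)
print(resolved)
```

Here `bwa_mem.*` ends up with its own `cores: 16` while picking up `mem` and the rules from `bowtie2`.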
TPV – Metascheduling
global:
  default_inherits: default
tools:
  default:
    cores: 1
    mem: cores * 3.8
    gpus: 0
    env: []
    params: []
    scheduling:
      reject:
        - offline
    rules: []
    rank: |
      final_destinations = helpers.weighted_random_sampling(candidate_destinations)
      final_destinations
Custom rank functions, written in Python, enable advanced metascheduling: they choose the best destination from the list of candidates. If no rank function is provided, TPV defaults to the most preferred destination.
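To give a feel for what a rank function does, here is a hypothetical stand-in that orders candidates randomly with a bias toward higher weights. It is not TPV's `helpers.weighted_random_sampling`; the function name `weighted_shuffle` and the example weights are assumptions made for illustration.

```python
import random


# Hypothetical stand-in for a rank function; NOT TPV's helpers.weighted_random_sampling.
def weighted_shuffle(candidates, weights, rng=None):
    """Return the candidates in a random order biased toward higher weights."""
    rng = rng or random.Random()
    pool = list(zip(candidates, weights))
    ordered = []
    while pool:
        # Draw one remaining candidate with probability proportional to its weight.
        index = rng.choices(range(len(pool)), weights=[w for _, w in pool])[0]
        ordered.append(pool.pop(index)[0])
    return ordered


# Example weights are made up; a real setup might derive them from free capacity.
ranked = weighted_shuffle(["slurm", "arc01"], weights=[3, 1], rng=random.Random(0))
print(ranked)
```

The first entry of the returned list plays the role of the chosen destination. Randomizing the order spreads jobs across destinations instead of always sending them to the single top choice.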
TPV – Current developments
[Diagram labels: Job + Destination data; Best destination recommendation; TPV; Job data; Additional metadata / Recommended destination info; Jobs; Destination usage stats; Destination metrics]
Summary
Thank you!
Object stores and Bring Your Own Storage
Bring Your Own Compute