"Smarter" job scheduling using (user) provided infrastructure
Grant agreement 101057388
EuroScienceGateway
Leveraging the European compute infrastructures for
data-intensive research guided by FAIR principles
Smarter job scheduling | Paul De Geest
Institutions involved
Motivation
National Cloud and HPC infrastructures have been established, with differences in
Goal: provide efficient and structured access to data, tools and workflows, supported by suitable IT infrastructures.
The Pulsar Network
The Pulsar Network is a distributed job execution system that lets Galaxy instances scale their available computing resources across heterogeneous, distributed compute facilities.
User provided infrastructure
Bring Your Own Compute
User provided infrastructure
Bring Your Own Storage
Smart job-scheduling
How do we efficiently schedule jobs from any UseGalaxy.* server to any Pulsar endpoint in the Pulsar network or a user-defined compute endpoint?
Set up
TPV
Meta-scheduler
Job Visualization
BYOC
BYOS
Time-series Database
Central database for collecting time-series info about all Pulsar destinations in the network:
Current and aggregate stats
Median job count
Median destination-tool queue/run times
Median Pulsar Load
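As a rough illustration of how such aggregate statistics could be derived from collected samples, the sketch below groups records by destination and tool and takes the median queue and run times. The record layout, destination names and tool name are assumptions made for this example, not the project's actual schema.

```python
from collections import defaultdict
from statistics import median

# Hypothetical samples: (destination, tool, queue_seconds, run_seconds).
samples = [
    ("pulsar-a", "bwa", 30, 600),
    ("pulsar-a", "bwa", 50, 660),
    ("pulsar-a", "bwa", 40, 720),
    ("pulsar-b", "bwa", 10, 900),
    ("pulsar-b", "bwa", 20, 1100),
]


def median_times(rows):
    """Return {(destination, tool): (median queue time, median run time)}."""
    grouped = defaultdict(lambda: ([], []))
    for dest, tool, queue_s, run_s in rows:
        queues, runs = grouped[(dest, tool)]
        queues.append(queue_s)
        runs.append(run_s)
    return {key: (median(q), median(r)) for key, (q, r) in grouped.items()}


stats = median_times(samples)
```

The median is used here rather than the mean because a few unusually long jobs would otherwise dominate the per-destination figures.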
Total Perspective Vortex
Meta-scheduling
Central API endpoint
Implements matchmaking logic, ranking available Pulsar destinations based on:
Currently, a simple weighting of the different metrics
Working on adding more advanced fuzzy/adaptive matchmaking algorithms
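A simple weighting of metrics could look like the sketch below: each metric is min-max normalised across destinations, multiplied by a weight, and summed, with the lowest score ranked first. The metric names, weights and destination names are hypothetical and chosen only for illustration; the actual meta-scheduler API may combine its metrics differently.

```python
def rank_destinations(metrics, weights):
    """Rank destinations by a weighted sum of min-max normalised metrics.

    Every metric is treated as "lower is better", so the best
    destination is the one with the lowest total score.
    """
    names = list(metrics)
    scores = dict.fromkeys(names, 0.0)
    for metric, weight in weights.items():
        values = [metrics[n][metric] for n in names]
        lo, hi = min(values), max(values)
        span = hi - lo
        for n in names:
            # If all destinations share the same value, the metric adds nothing.
            norm = (metrics[n][metric] - lo) / span if span else 0.0
            scores[n] += weight * norm
    return sorted(names, key=lambda n: scores[n])


# Hypothetical per-destination statistics and weights.
metrics = {
    "pulsar-a": {"median_queue_s": 40, "median_run_s": 660, "load": 0.9},
    "pulsar-b": {"median_queue_s": 15, "median_run_s": 1000, "load": 0.3},
    "pulsar-c": {"median_queue_s": 300, "median_run_s": 700, "load": 0.5},
}
weights = {"median_queue_s": 0.4, "median_run_s": 0.3, "load": 0.3}

ranking = rank_destinations(metrics, weights)
```

With these made-up numbers, pulsar-b ranks first: its short queue and low load outweigh its longer run time.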
Visualisation
Future Work
Enhancing data locality and performance by unifying multiple (user) object stores into a single Onedata object store
Thanks!
All ESG members and specifically
Smart job-scheduling
Add a practical example of the workflow (TPV ranking function, example statistics we collect, API?, example algorithms in the API?)
Data collection
Data locality
Metascheduling
Motivation
Galaxy can be deployed on a laptop
Galaxy can be deployed on top of a large compute cluster
Bring Your Own Storage (BYOS)
Existing approach and way forward
Galaxy File Source Plugin added for Onedata
Bring your own Object Storage via S3
Use your own storage as the default
Choose preferred storage per history, workflow and tool
Deliverable/Milestone | Due date | Verification Method | Progress Status (%) |
D4.1 Bring Your Own Infrastructure (compute, storage) Demonstrator | 31-Aug-2024 | Report | 50% |
D4.2 Publication on the smart job scheduler implementation | 28-Feb-2025 | Publication | 20% |
M4.1 BYOC and BYOS integrated into ESG | 31-Aug-2023 | Software available | 100% |
M4.2 Meta-scheduler model for job optimisation available | 28-Feb-2024 | Software available | 30% |
Collaborations & Future Work
Björn Grüning (ALU)
ESG/Galaxy and related projects in previous EOSC projects
Additional projects
BioModels
The next year!
BYOS
BYOC
Introduction and Overview
EuroScienceGateway will leverage a distributed computing network across 13 European countries, accessible via 6 national, user-friendly web portals. These portals facilitate access to compute and storage infrastructures across Europe, as well as to data, tools, workflows and services that can be customized to suit researchers’ needs. At the heart of the proposal, workflows will integrate with the EOSC-Core. Adoption, development and implementation of technologies to interoperate across services will allow researchers to produce high-quality FAIR data, available to all in EOSC. Communities across disciplines (Life Sciences, Climate and Biodiversity, Astrophysics, Materials Science) will demonstrate the bridge from EOSC's technical services to scientific analysis.