1 of 61

The Task Execution Service (TES) API

Kyle Ellrott & Alex Kanitz

2 of 61

1.0. Introduction

3 of 61

GA4GH Cloud Work Stream APIs

Sharing Tools and Workflows

Executing Workflows

Executing Individual Tasks

Accessing Data

Data Repository Service, DRS

4 of 61

Task Execution Service (TES) API

A way to send a request to run a Docker-based tool in a remote environment, monitor progress, and retrieve the result. (TES does tasks, WES does workflows)

POST new task

GET task status

GET task stderr/stdout

API Standard to Execute

Tools

Docker

JSON

stderr

stdout

file(s)

status

+

Cloud-specific Implementation

Official GA4GH Standard!
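The three calls on this slide can be sketched as plain HTTP requests. This is a minimal illustration rather than an official client: the base URL and task ID are placeholders, the task body shows only a few fields of the TES task schema, and the requests are built but never sent.

```python
import json
import urllib.request

# Placeholder base URL (assumption); real deployments publish their own.
TES_BASE = "https://tes.example.org/ga4gh/tes/v1"


def build_create_task(image, command):
    """Build (but do not send) a POST /tasks request for a one-executor task."""
    task = {
        "name": "hello-tes",
        "executors": [
            {
                "image": image,                # Docker image to run
                "command": command,            # argv list, not a shell string
                "stdout": "/tmp/stdout.txt",   # container path capturing stdout
                "stderr": "/tmp/stderr.txt",   # container path capturing stderr
            }
        ],
    }
    return urllib.request.Request(
        f"{TES_BASE}/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_task(task_id, view="MINIMAL"):
    """Build a GET /tasks/{id} request.

    view=MINIMAL returns only the task ID and state; view=FULL also
    includes the logs with executor stdout/stderr.
    """
    return urllib.request.Request(
        f"{TES_BASE}/tasks/{task_id}?view={view}", method="GET"
    )


create_req = build_create_task("alpine", ["echo", "hello"])
status_req = build_get_task("task-123")             # task status
logs_req = build_get_task("task-123", view="FULL")  # status plus stdout/stderr
```

Passing any of these to `urllib.request.urlopen` would perform the call against a real server.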

5 of 61

TES 1.1 Features

  • Allow wildcards in output paths
    • https://github.com/ga4gh/task-execution-schemas/pull/185
  • Changing int64 to int32 for page_size and cpu_count
    • https://github.com/ga4gh/task-execution-schemas/pull/175
  • Add in new tesState for preempted state
    • https://github.com/ga4gh/task-execution-schemas/pull/184
  • Add Bearer security scheme & apply globally
    • https://github.com/ga4gh/task-execution-schemas/pull/179
  • Add filtering features to task listing
    • https://github.com/ga4gh/task-execution-schemas/pull/170
  • Add ignore_error flag
    • https://github.com/ga4gh/task-execution-schemas/pull/159
  • Add streamable flag to tesInput
    • https://github.com/ga4gh/task-execution-schemas/pull/157
  • Make file type an optional argument
    • https://github.com/ga4gh/task-execution-schemas/pull/155
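A rough sketch of how several of these v1.1 additions might look in practice. Bucket URLs, paths, and the tag values are placeholders, and the helper `list_tasks_query` is a hypothetical name introduced only for this illustration.

```python
from urllib.parse import urlencode

# Task fragment using v1.1 additions; all URLs and paths are placeholders.
task_v1_1 = {
    "executors": [
        {
            "image": "alpine",
            "command": ["sh", "-c", "cat /data/in.txt > /out/a.txt"],
            "ignore_error": True,  # later executors still run if this one fails
        }
    ],
    "inputs": [
        {
            "url": "s3://example-bucket/in.txt",
            "path": "/data/in.txt",
            "streamable": True,  # hint that the input may be streamed
            # "type" is left out: it is optional as of v1.1
        }
    ],
    "outputs": [
        {
            "url": "s3://example-bucket/results/",
            "path": "/out/*.txt",   # wildcard output path
            "path_prefix": "/out",  # prefix removed from matched paths
        }
    ],
}


def list_tasks_query(name_prefix=None, state=None, tags=None, page_size=None):
    """Build the query string for a filtered GET /tasks call (hypothetical helper)."""
    params = []
    if name_prefix is not None:
        params.append(("name_prefix", name_prefix))
    if state is not None:
        params.append(("state", state))
    # Tags are sent as parallel tag_key / tag_value parameters.
    for key, value in (tags or {}).items():
        params.append(("tag_key", key))
        params.append(("tag_value", value))
    if page_size is not None:
        params.append(("page_size", str(page_size)))
    return urlencode(params)


query = list_tasks_query(
    name_prefix="align-", state="PREEMPTED", tags={"project": "demo"}, page_size=50
)
```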

6 of 61

TES Ecosystem

TES-compatible CWL workflow engine by Seven Bridges Genomics.

github.com/rabix/bunny

WDL workflow engine built by The Broad Institute

github.com/broadinstitute/cromwell

Implementation of TES running on Kubernetes

github.com/EMBL-EBI-TSI/TESK

Microsoft-developed and maintained TES implementation on Azure

https://github.com/microsoft/ga4gh-tes

TES endpoint supporting HPC clusters and multiple clouds

https://github.com/ohsu-comp-bio/funnel

7 of 61

2.0. Workflow Engine & TES Client Updates

8 of 61

2.1. Cromwell & Nextflow

9 of 61

Cromwell on Azure powered by TES on Azure

Production-ready, FOSS implementation of the Broad Institute's Cromwell workflow engine on Azure, powered by TES on Azure for max scale, cost-optimized genomics workflow execution

10 of 61

Nextflow

  • Beta testing TES 1.1 compatible plugin
  • Now supports authentication to TES API

11 of 61

2.2. Snakemake & CWL

12 of 61

Snakemake

TES executor implemented by ELIXIR Cloud & AAI Driver Project in 2020/2021 (PR)

Update to TES v1.1 planned

Potentially also a more native / less opaque implementation (currently each task is basically a 1-step Snakemake workflow)

13 of 61

CWL

  • cwl-tes: upgrade to TES v1.1 planned, pending the py-tes upgrade to v1.1
  • Used in cwl-WES, which is currently undergoing a major refactoring; to be released very soon (head commit currently broken 😢)

14 of 61

Demo: Use TES to bring compute to data

Uses cwl-tes (and Snakemake)* with TES gateway and simple distribution logic to bring compute to data across a heterogeneous network of TES instances (HPC and native cloud)

* works in principle, but needs a bit more testing/integration into existing demo

  • The CWL demo can be found here (recording of live demo)
  • The Snakemake demo can be found here
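The "simple distribution logic" idea can be illustrated with a toy router. This is a hypothetical sketch, not the gateway's actual code: the host-to-endpoint registry, the default endpoint, and the majority-vote rule are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical registry: storage host -> TES endpoint co-located with it.
TES_BY_STORAGE_HOST = {
    "s3.site-a.example.org": "https://tes.site-a.example.org",
    "ftp.site-b.example.org": "https://tes.site-b.example.org",
}
DEFAULT_TES = "https://tes.default.example.org"  # fallback (assumption)


def choose_tes(task):
    """Pick the TES instance that sits next to most of the task's inputs."""
    votes = {}
    for item in task.get("inputs", []):
        host = urlparse(item.get("url", "")).hostname
        endpoint = TES_BY_STORAGE_HOST.get(host)
        if endpoint:
            votes[endpoint] = votes.get(endpoint, 0) + 1
    if not votes:
        return DEFAULT_TES
    # Highest vote count wins; ties go to the endpoint seen first.
    return max(votes, key=votes.get)


task = {
    "inputs": [
        {"url": "s3://s3.site-a.example.org/bucket/reads_1.fq"},
        {"url": "s3://s3.site-a.example.org/bucket/reads_2.fq"},
        {"url": "ftp://ftp.site-b.example.org/ref/genome.fa"},
    ]
}
target = choose_tes(task)
```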

15 of 61

Compute federation

16 of 61

Compute federation: Gateway

17 of 61

Compute federation: CWL workflow

18 of 61

Future directions

  • Support for additional client-side (Nextflow, Galaxy) & server-side (TES on Azure, Pulsar) implementations
  • Support for sensitive data (encryption, isolation) via GA4GH Cloud standards
  • More sophisticated distribution logic (e.g., taking into account constraints of where data can move)
  • Interoperable authorization/credentials

19 of 61

2.3. Galaxy

20 of 61

Galaxy

  • Preliminary version implemented by Vipul Chhabra as part of a Google Summer of Code project in 2021
  • Final version implemented by Galaxy Dev team
  • Integrated into Pulsar job distribution system
  • Multiple modes of operation depending on Pulsar configuration

21 of 61

2.4. py-tes

22 of 61

py-tes

Python client for accessing TES servers

pip install py-tes

Currently supports v1.0, will be updated to v1.1
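A minimal usage sketch, following the pattern in the py-tes README. It assumes py-tes is installed and that `base_url` points at a reachable TES server; the wrapper function `run_hello_task` is a name introduced here for illustration.

```python
def run_hello_task(base_url):
    """Submit a one-step task and return (task_id, state)."""
    # Imported here so the sketch can be loaded without py-tes installed.
    import tes

    task = tes.Task(
        executors=[tes.Executor(image="alpine", command=["echo", "hello"])]
    )
    client = tes.HTTPClient(base_url, timeout=5)
    task_id = client.create_task(task)  # POST new task
    info = client.get_task(task_id)     # GET task status
    return task_id, info.state
```

Calling `run_hello_task("http://funnel.example.com")` would submit the task to that (placeholder) server.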

23 of 61

3.0. TES Server Updates

24 of 61

3.1. TES on Azure

25 of 61

TES on Azure v4.4

  • Currently the compute backbone for workflow execution in Terra on Azure and CoA
  • Production-ready, developed and supported by the Microsoft Biomedical Platforms and Genomics team (Microsoft Research) since 2019
  • Engineered & tested with 1000s of concurrently running VMs (TES tasks) for max in-region scale out
  • Automatic cost-optimization; run any Docker image on any Azure VM SKU
  • Dockerfile and free publicly-hosted Docker image running .NET 7 on Ubuntu
  • Engineering north star for CY23 includes automatic publishing of weekly releases and workflow and task performance benchmarks on GitHub for max-velocity value delivery and transparency
  • MSFT engineering team has appointments at the Broad Institute including Jesus Aguilar, key architect of Terra on Azure and a multi-trillion dollar daily asset settlement system in banking, and Matt McLoughlin, architect of the Microsoft Genomics service (2015)
  • MIT License (free open-source software for any use)
  • https://github.com/microsoft/ga4gh-tes

26 of 61

3.2. TESK & TES gateway

27 of 61

TESK - TES for Kubernetes

  • Stateless wrapper around Kubernetes
  • TESK API creates Kubernetes job per task
  • Implemented using
    • Kubernetes
    • Java Spring Boot (API)
    • Python (everything else)
  • Maintained by ELIXIR Cloud & AAI, though maintenance is difficult because of the complex codebase
  • May migrate API to Python before upgrading to v1.1
  • Forked by Sophia Genetics & tested at large scale, might flow back into open source some time ⇒ may solve the language problem

28 of 61

3.3. Funnel

29 of 61

Funnel

Funnel provides a TES endpoint to a number of backends including:

  • HPC (Slurm/GridEngine/Condor)
  • AWS Batch
  • Kubernetes

Supported* Databases include:

  • BoltDB (helpful for local TES deployment)
  • MongoDB (*version update in progress!)
  • DynamoDB

Supported Object Storage:

  • S3-compatible data buckets
  • FTP, Pub/Sub
  • Local filesystem!

30 of 61

Funnel TESv1.1 Support Update

Priorities (slightly opinionated):

  • TESv1.1 compliance
    • Re-enabling support for the wide range of backends, databases, filestores
    • Web + term dashboards
    • Remote Funnel server connections
  • Easy (mostly painless?) deployments!
  • Lock-step updates with future TES versions

Examples of the Nitty Gritty:

  • Automated compliance testing + public report page
  • Slurm update (v17.11 → 21.08)
  • MongoDB update (v3 → 6)
  • New tag filtering in web UI

31 of 61

Funnel Testing Strategy

Compliance Tests:

  • Uses AAI’s TES Compliance Suite
    • Thanks Lakshya, Alex, & Yulia!
  • Test matrix allows for combination of different TES versions with compute/storage targets defined in a single workflow file, for example:
    • TES: v1.1
    • Database: MongoDB
    • Compute engine: Slurm
  • Run in parallel, with a report generated on success

Unit Tests/Linting:

  • Run alongside compliance tests with test suite defined in Funnel’s codebase (e.g. ./tests/core)
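The test-matrix idea amounts to a Cartesian product over the configured axes. A small sketch, where the axis values are illustrative rather than Funnel's actual matrix:

```python
from itertools import product

# Illustrative axes (assumption); the real matrix lives in a workflow file.
MATRIX = {
    "tes_version": ["v1.0", "v1.1"],
    "database": ["BoltDB", "MongoDB"],
    "compute": ["local", "Slurm"],
}


def expand_matrix(matrix):
    """Return one job configuration per combination of axis values."""
    keys = list(matrix)
    return [
        dict(zip(keys, combo))
        for combo in product(*(matrix[key] for key in keys))
    ]


jobs = expand_matrix(MATRIX)  # 2 x 2 x 2 = 8 independent jobs
```

Each resulting dict describes one job that can run in parallel with the others.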

32 of 61

4.0. Google Summer of Code Project Updates

33 of 61

4.1. TES compliance test suite

34 of 61

TES compliance test suite

35 of 61

OpenAPI Test Runner

  • Interoperable automated compliance testing framework
  • Customize tests yourself and execute them through OpenAPI test runner
  • Decoupled test runner and test suites
  • Supports every OpenAPI specification
    • Convert into pydantic models for validation
  • Every test is independent and sequential
  • Highly reusable and extensible
  • Easy to integrate into CI pipelines (GitHub Actions)
  • Offers two modes:
    • Validation mode: Validates only models, tests and templates
    • Runner mode: Executes the tests and collects the results

36 of 61

Test Schema

Endpoint details

Polling

Filtering

Validate tests using the test schema

The OpenAPI test runner supports:

  • Request/response body validation
  • HTTP status validation
  • Polling
  • Filtering
  • Storage & environment variables
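Polling, as used by the state-checking tests, can be sketched as a small loop. This is not the test runner's implementation: the function name, its parameters, and the choice of which states count as terminal are assumptions for illustration.

```python
import time

# States after which a task will not change again (assumed subset).
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def poll_until(fetch_state, expected, interval=1.0, max_attempts=10,
               sleep=time.sleep):
    """Call fetch_state() until it returns `expected` or polling must stop.

    Returns (matched, attempts). Polling stops early when the task reaches
    a terminal state other than the expected one.
    """
    for attempt in range(1, max_attempts + 1):
        state = fetch_state()
        if state == expected:
            return True, attempt
        if state in TERMINAL_STATES:
            return False, attempt
        if attempt < max_attempts:
            sleep(interval)
    return False, max_attempts


# Usage with a canned sequence of states instead of a real server.
states = iter(["QUEUED", "RUNNING", "CANCELED"])
result = poll_until(lambda: next(states), "CANCELED", sleep=lambda _: None)
```

Injecting `fetch_state` and `sleep` keeps the loop testable without a live TES server.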

37 of 61

Tests written for TES

  • Schema-validation based
    • GetTask (Minimal, Basic, Full)
    • ListTasks (Minimal, Basic, Full)
    • ServiceInfo, CreateTask, CancelTask
  • Polling-based (validate the task state)
    • CancelTask and CreateTask
  • Filtering
    • Filter by name
    • Filter by state and tags (v1.1.0)
  • V1.1.0
    • Backend parameters, Ignore error, streamable

A total of 23 tests have been written for the TES test suite to check conformance to the API specification in detail

38 of 61

Executing the tests

--server

The server URL of the API instance

--version

API version

--include-tags & --exclude-tags

Run specific tests with matching tags

--test-path

Run all the tests at the given path (directory or file)

--output-path

Store the JSON report locally at the given path

39 of 61

Results for cancel_task test

Job-1 creates a task

Job-2 cancels the task

All responses valid. Test is successful.

40 of 61

Test skipped, since v1.0.0 does not support ignore_error

Test failed as the Get Task Basic response missed a required field

41 of 61

Results summary for v1.0.0 (Skips some tests meant for v1.1.0)

Results summary for v1.1.0

42 of 61

Reports

Report Text View

Report Table View

Report JSON View

43 of 61

4.2. Web Component for TES API

44 of 61

Javed Habib (JaeAeich)

  • Cross-framework and highly configurable component
  • Easy to install and use via CDNs or as an npm library

45 of 61

Implementation

  • Prioritized performance and scalability
  • Microsoft FAST for strong Design Token support ⇒ facilitates consistent branding and design

46 of 61

Components developed:

TES:

  • ecc-client-ga4gh-tes-service
  • ecc-client-ga4gh-tes-run
  • ecc-client-ga4gh-tes-runs
  • ecc-client-ga4gh-tes-create-run

47 of 61

Display service info for TES API

48 of 61

Queries the TES API to get the full view of a run, including its stats and metadata.

Expands on click

Wrapped in the larger component ecc-client-ga4gh-tes-runs

49 of 61

Component used to browse tasks

Filters via name_prefix (logic handled by the backend; currently non-functional)

Filter by status, such as complete, error, system error, processing, etc.

Pagination

Number of tasks per page is configurable

50 of 61

Trivial/necessary fields uncollapsed by default

Other fields can be set as per requirements

Create Task runs!

51 of 61

ELIXIR Cloud Components

Javed Habib (JaeAeich)

  • Collection of reusable/rebrandable Web Components for ELIXIR Cloud services (and GA4GH APIs in general)
  • Next to the TES API, the following Web Components are under development
    • WES API
    • Service Registry API
    • TRS API
    • TRS-Filer (ELIXIR Cloud & AAI’s generic TRS implementation)
  • Components need finalization/harmonization, planned release of suite in 2024

52 of 61

5.0. Future of TES API

53 of 61

5.1. API extensions

54 of 61

API Extensions

  • Basic idea
    • Each standard MAY define a set of optional “capabilities”...
    • …which are expected to follow an API extension if supported…
    • …making them interoperable and testable
    • Can be broadcast through the /service-info endpoint…
    • …and discovered through a Service Registry API implementation
  • Issue @ TASC: https://github.com/ga4gh/TASC/issues/45 (has an ad hoc suggestion of what this could look like)
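One way the idea above might look in practice. The `capabilities` field is purely hypothetical and is not part of the service-info specification; the service entries, capability names, and the helper `services_with` are likewise invented for illustration.

```python
# Service-info style records, as a Service Registry might return them.
SERVICES = [
    {
        "id": "org.example.tes-a",
        "name": "TES A",
        "type": {"group": "org.ga4gh", "artifact": "tes", "version": "1.1.0"},
        "capabilities": ["crypt4gh", "htsget"],  # hypothetical extension field
    },
    {
        "id": "org.example.tes-b",
        "name": "TES B",
        "type": {"group": "org.ga4gh", "artifact": "tes", "version": "1.1.0"},
        # no "capabilities" key: this service advertises none
    },
]


def services_with(services, artifact, capability):
    """Return the IDs of services of a given type that advertise a capability."""
    return [
        service["id"]
        for service in services
        if service.get("type", {}).get("artifact") == artifact
        and capability in service.get("capabilities", [])
    ]


crypt4gh_tes = services_with(SERVICES, "tes", "crypt4gh")
```

A client could use such a lookup to route tasks only to TES instances that support the capability it needs.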

55 of 61

Possible examples

  • GA4GH & other community standards
    • Support for Crypt4GH, Beacon, htsget, refget etc.
  • Data security / TRS features
    • Support for Trusted Execution Environments, multiparty homomorphic encryption, differential privacy etc.
  • Federation-related features
    • Gateway service-related: list of supported middlewares (e.g., distribution logic)

56 of 61

Discussion

  • Do we want to push for this? 👍👎
  • What other capabilities would make sense?

57 of 61

5.2. Auth/Pass

58 of 61

Auth/Pass

Current spec:

TES does not accept credentials with inputs/outputs

Current implementations: Global secrets to access in/out locations stored at the backend. Good for a private TES server.

59 of 61

Auth/Pass

Sending passports with TES payload:

How can a passport be translated into credentials for a URL?

60 of 61

Auth/Pass

  • Sending passports with TES payload
  • Inputs as DRS URLs
  • Is TES the right layer to resolve DRS URLs?
  • What about temporary inputs and outputs? Can DRS be used to create such objects?
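For discussion, a sketch of what resolving a DRS input with a passport could involve, wherever that logic ends up living. It covers only hostname-based DRS URIs (not compact identifiers), builds the request without sending it, and uses placeholder hosts, IDs, and tokens; the function names are introduced here for illustration.

```python
import json
import urllib.request
from urllib.parse import urlparse


def drs_to_https(drs_uri):
    """Translate a hostname-based DRS URI into its HTTPS object URL."""
    parsed = urlparse(drs_uri)
    object_id = parsed.path.lstrip("/")
    if parsed.scheme != "drs" or not parsed.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


def build_passport_request(drs_uri, passport_token):
    """Build (but do not send) a POST that presents a passport for an object."""
    body = json.dumps({"passports": [passport_token]}).encode("utf-8")
    return urllib.request.Request(
        drs_to_https(drs_uri),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


url = drs_to_https("drs://drs.example.org/314159")
req = build_passport_request("drs://drs.example.org/314159", "placeholder-token")
```

The open question on the slide remains: this translation yields an object record, and something still has to turn it into credentials for the actual storage URL.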

61 of 61

5.3. Other needs? Brainstorming session