1 of 32

Cloud Work Stream

GA4GH Connect 2021

Chairs: Brian O’Connor, David Glazer

ga4gh.org

2 of 32

Welcome!

  • Agenda: https://bit.ly/3q3uO2X

  • Driver project check-in
    • How are the Cloud APIs being used for your project?
    • What improvements do you need to see in order to adopt?

  • API Champion check-in and planning:
    • What are the major 1-3 issues you want to address between now and plenary?
    • Which will be addressed during Connect?

  • Summarize our goals for the week
    • Review all Cloud-relevant GA4GH Connect sessions


3 of 32

Driver Project Check-in


4 of 32

ICGC-ARGO

Key Objectives: Automated execution of workflows at scale (millions of executions) across geographically distributed processing centers.

  • should be horizontally scalable, allowing for efficient parallelization while utilizing as much of the allocated underlying hardware as possible
  • should be resilient, minimizing halting errors which may result from bad data, network connectivity issues, and other such factors
  • should be easily reproducible for the various partners that will operate a Regional Data Processing Center (RDPC), allowing them to get up and running quickly without the need for specialized hardware (we run a Kubernetes cluster in Cancer Collaboratory - an OpenStack cloud maintained by the OICR)
  • should be flexible enough to allow for changes as requirements and technology mature


5 of 32

ICGC-ARGO

Cloud APIs in Use

  • WES - A standard REST API is available for interacting with our WES system. It currently supports only the Nextflow engine, but the architecture allows relatively easy extension to CWL and WDL
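
As a rough illustration of the REST interaction described above, the sketch below assembles the form fields for a `POST /runs` request using the field names from the WES 1.0 specification. The base URL, workflow URL, workflow type string, and parameter names are placeholders rather than details of the ARGO deployment, and no request is actually sent.

```python
import json

# Hypothetical endpoint; replace with a real WES deployment.
WES_BASE_URL = "https://wes.example.org/ga4gh/wes/v1"


def build_run_request(workflow_url, workflow_type, workflow_type_version,
                      params, tags=None):
    """Return (url, form_fields) for a WES POST /runs request.

    WES expects multipart form data in which workflow_params and tags
    are JSON-encoded strings, so they are serialized here.
    """
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(params),
    }
    if tags:
        fields["tags"] = json.dumps(tags)
    return f"{WES_BASE_URL}/runs", fields


if __name__ == "__main__":
    # The type string "NFL" and the parameters are assumptions for the demo.
    url, fields = build_run_request(
        "https://example.org/workflows/align/main.nf",
        "NFL",
        "21.04",
        {"study_id": "TEST-STUDY"},
    )
    print(url)
    print(fields)
```

The returned fields can be handed to any HTTP client that supports multipart form uploads.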

Cloud APIs Under Consideration

  • TRS - We need a central workflow registry similar to Dockstore, but the key feature missing is some sort of workflow parameter schema registration
    • workflows are at their best when they can be described as a black box to be called with some input schema; this will be mandatory to minimize an entire class of errors when workflows are used in automated systems
    • with an input schema, the underlying workflow engine should no longer be of consequence for the end user so long as there is a matching implementation supported by the WES the user has access to
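
One way to picture the parameter schema idea is a small check that runs before a workflow is submitted. The schema format below is invented for illustration (it is not part of TRS), and the parameter names are hypothetical.

```python
# Hypothetical parameter schema: name -> (expected type, required?)
ALIGN_SCHEMA = {
    "study_id": (str, True),
    "analysis_id": (str, True),
    "cpus": (int, False),
}


def validate_params(params, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, (expected_type, required) in schema.items():
        if name not in params:
            if required:
                problems.append(f"missing required parameter: {name}")
            continue
        if not isinstance(params[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(params[name]).__name__}"
            )
    # Reject parameters the workflow does not declare.
    for name in params:
        if name not in schema:
            problems.append(f"unknown parameter: {name}")
    return problems


if __name__ == "__main__":
    print(validate_params({"study_id": "S1", "cpus": "4"}, ALIGN_SCHEMA))
```

With a registered schema of this kind, an automated system could reject a malformed request before any compute is allocated, regardless of which engine implements the workflow.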


6 of 32

ICGC-ARGO

WES Usage Notes

  • The standard list response at /runs does not give us enough information
    • not sure that there is a universally acceptable minimum field response but for us runId and state are insufficient
  • The run state model has proven to be robust enough to meet our needs for the past year and into the future as we continue to develop our tools
    • we may need additional intermediary states between QUEUED and ACTIVATING to support user defined middleware
  • We seldom need just run information; often we want to know what input data was used in the run and what output data was produced by the run
    • we manage our object storage data and its associated metadata with our open-source tools, SONG and SCORE, part of Overture.bio
    • this desire to join two different data sources in a single query led us to develop a GraphQL API modeled after the WES API
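
To make the first point concrete, here is a minimal sketch of what a client has to do today to get more than `run_id` and `state`: page through `GET /runs`, then issue one `GET /runs/{run_id}` per run. The `fetch` callable is an assumption standing in for whatever HTTP client is used; it takes a path and query parameters and returns parsed JSON.

```python
def list_runs_with_details(fetch, page_size=50):
    """Yield the full run log for every run known to a WES server.

    `fetch(path, params)` is an injected callable returning parsed JSON,
    which keeps the sketch independent of any HTTP library.
    """
    page_token = None
    while True:
        params = {"page_size": page_size}
        if page_token:
            params["page_token"] = page_token
        listing = fetch("/runs", params)
        for status in listing.get("runs", []):
            # The list response carries only run_id and state, so each run
            # needs its own request to recover its inputs and outputs.
            yield fetch(f"/runs/{status['run_id']}", {})
        page_token = listing.get("next_page_token")
        if not page_token:
            break
```

For N runs this costs N detail requests on top of the list requests, which is the overhead a richer list response or a GraphQL layer avoids.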


7 of 32

ICGC-ARGO

WES GraphQL - The WES Data Model, but using GraphQL instead of REST

  • For an automated system to decide whether or not to run a workflow, it needs information from more than just the workflow domain; we also need information from the data domain
  • With REST this would mean multiple queries across various APIs at decision time, or maintaining join tables and services that are very specific and often not portable across organizations
  • We developed a WES GraphQL API that follows the same data model as the traditional REST API. Besides letting us craft responses with exactly the fields we want, it can be joined with a similar GraphQL service running against our data registry (SONG/SCORE). This lets us look up the input data and the produced data for any given run, thereby building a provenance graph
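
The kind of joined query described above might look roughly like the sketch below. The query shape and every field name in it (`runs`, `inputAnalyses`, `producedAnalyses`, and so on) are illustrative guesses, not the actual ARGO schema; only the `query`/`variables` request body follows the usual GraphQL-over-HTTP convention.

```python
import json

# Field names are illustrative guesses, not the actual ARGO schema.
RUN_PROVENANCE_QUERY = """
query RunProvenance($runId: String!) {
  runs(filter: {runId: $runId}) {
    runId
    state
    inputAnalyses { analysisId analysisType }
    producedAnalyses { analysisId analysisType }
  }
}
"""


def build_graphql_payload(run_id):
    """Build the JSON body for a GraphQL-over-HTTP POST request."""
    return json.dumps({
        "query": RUN_PROVENANCE_QUERY,
        "variables": {"runId": run_id},
    })


if __name__ == "__main__":
    print(build_graphql_payload("wes-0123"))
```

A single request of this form would replace the separate calls to the workflow and data APIs that a REST client has to stitch together.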


8 of 32

ICGC-ARGO

WES GraphQL Example


9 of 32

Genomics England

  1. Orchestration: we are decoupling sample orchestration and workflow orchestration. We are working on how to best use the GA4GH WES/TES standards to support workflow orchestration.
  2. TES: we have a few “interpretation services” that are very similar in spirit to what a TES would support. We have used TES to implement a few of these (also particular steps in the workflows in 1) and it has worked well. Not yet in prod, still in prototype.
  3. Research Environments: We have GA4GH cloud standards in our roadmap for the cloud-based research environments commencing in Q4 2021. We have an environment in AWS running Nextflow and, in mid-March, Cromwell.
  4. Private buckets as a very common use case


10 of 32

DP Update: ELIXIR Cloud

Cloud WS meeting @ GA4GH Connect 2021

The ELIXIR Cloud & AAI Initiative

Alex Kanitz (CH), Álvaro González (FI), Jonathan Tedds (Hub) & ELIXIR Compute Platform

ELIXIR-CH, -CZ, -DE, -FI, -GR, -IT, Hub & EMBL-EBI

github.com/elixir-cloud-aai

11 of 32

Who we are

  • GA4GH Driver Project: Dedicated to implementing & promoting GA4GH Cloud standards
  • Project helmed by ELIXIR Compute Platform but cross-ELIXIR Platform/Community effort
  • GitHub org with 20+ repos & Slack space with 120+ members

“Enable federated environments to deliver large scale workflow analysis across international boundaries”

12 of 32

What we do

Represent ELIXIR stakeholders in GA4GH & promote GA4GH standards within ELIXIR

Prototype real-world use cases with ELIXIR stakeholders, develop PoCs & deploy at ELIXIR nodes

Consult on integrating GA4GH standards into existing solutions and provide technical support

Interoperability testing with third party GA4GH-powered solutions

13 of 32

Current state of integration

  • Guidelines for token flow developed with ELIXIR AAI, aligning with GA4GH (Passport scope currently not used)
  • Helm charts for deployment on cloud native infrastructure

Current deployments: [figure not reproduced: map with deployment sites numbered 1 to 11]

14 of 32

Selected achievements 2020

  • Used all four GA4GH Cloud APIs together in the FASP demonstrator at the 8th GA4GH Plenary (press release)
  • Started development of a centralized service registry for ELIXIR-based GA4GH (Cloud) services (repo); currently being deployed in ELIXIR-CH
  • Co-led the TES v1.0 standard (Ania Niewelska, EMBL-EBI), soon to be officially adopted (press release coming soon)
  • Implemented experimental client-side TES support in Snakemake (Mölder et al.)
  • First successful interoperability tests with Cavatica & DNAstack components (GitHub issue)

Current & future work to be presented during Cloud WS call on Mar 8th!

15 of 32

What do we need from GA4GH Cloud APIs? (I)

  • Controlled access to data and compute(!) via Passport Visas
    • Possibly multiple options, depending on desired level of security
    • Ideally fully compliant with or extending relevant standards (OIDC, OAuth2); might need engagement of broader tech community
    • Retrieve up-to-date lists of healthy WES/TES deployments that the user trusts and is permitted to use from the service registry
  • Better support for federated computing / bringing compute to data
    • Objective: balance runtime vs costs & minimize data movement
    • Params: data security/transfer constraints; precise location of data & compute; bandwidth between data & compute; max execution time; compute, storage & transfer cost model; user’s run priority (cost vs runtime)
    • Proposal: /runs/info (WES) & /tasks/info (TES) endpoints with runtime/cost predictions
    • PoC: limited, but working TES-based task distribution logic & required spec changes in our TEStribute repo
    • Other considerations: cost accounting/transfer?
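
The runtime-versus-cost trade-off above can be sketched as a simple ranking over predictions of the kind a `/tasks/info` or `/runs/info` endpoint might return. This is a toy scoring rule under stated assumptions, not the logic implemented in TEStribute; the `Estimate` fields, the weighting scheme, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    service_url: str      # candidate WES/TES deployment
    runtime_hours: float  # predicted wall-clock time
    cost: float           # predicted total cost (compute, storage, transfer)


def pick_service(estimates, cost_weight, max_runtime_hours=None):
    """Pick the candidate with the lowest weighted, normalized score.

    cost_weight is in [0, 1]: 1.0 means "cheapest", 0.0 means "fastest".
    """
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must be between 0 and 1")
    candidates = [
        e for e in estimates
        if max_runtime_hours is None or e.runtime_hours <= max_runtime_hours
    ]
    if not candidates:
        raise ValueError("no candidate satisfies the runtime limit")
    # Normalize by the worst value so cost and runtime are comparable.
    max_cost = max(e.cost for e in candidates) or 1.0
    max_runtime = max(e.runtime_hours for e in candidates) or 1.0

    def score(e):
        return (cost_weight * e.cost / max_cost
                + (1.0 - cost_weight) * e.runtime_hours / max_runtime)

    return min(candidates, key=score)


if __name__ == "__main__":
    fast = Estimate("https://tes-a.example.org", runtime_hours=2.0, cost=10.0)
    cheap = Estimate("https://tes-b.example.org", runtime_hours=5.0, cost=4.0)
    print(pick_service([fast, cheap], cost_weight=0.8).service_url)
```

Data security and transfer constraints would act as hard filters before scoring, in the same way the maximum execution time does here.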

with DURI/ Data Security / Discovery / FASP

16 of 32

What do we need from GA4GH Cloud APIs? (II)

  • Specify status callbacks & necessary endpoints (TES > client & WES > client)
  • Harmonization of (Cloud) APIs
    • Better abstraction & reuse of schemas/models, especially between DRS/TRS & WES/TES, but also more general (e.g., common pagination)
    • Improved consistency across GA4GH (Cloud) APIs; would a required consistency review of every PR by a single person (e.g., the CSO) be feasible?
    • Harmonize and, if necessary, re-think DRS & TRS URIs (TRS URIs need to be versioned)
    • Uniform & improved error responses
  • Enable (more) convenient handling of sensitive data
    • Data access
    • Container security
  • Increase awareness & adoption of TES
    • Engagement of compute infrastructure (e.g., cloud providers, queueing systems) & workflow management system developer communities
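
As an illustration of why uniform error responses matter, the sketch below shows the kind of shim a client needs today. It assumes error bodies keyed either as `msg`/`status_code` (the shape used by the WES and DRS error schemas) or as `code`/`message` (the shape used by TRS); treat these field names as assumptions and check them against the spec versions you target.

```python
def normalize_error(body, http_status):
    """Map differently shaped error bodies onto one status/message pair.

    Falls back to the HTTP status and a stringified body when the
    response is not a JSON object.
    """
    if not isinstance(body, dict):
        return {"status_code": http_status, "msg": str(body)}
    status = body.get("status_code", body.get("code", http_status))
    message = body.get("msg", body.get("message", ""))
    return {"status_code": int(status), "msg": message}


if __name__ == "__main__":
    print(normalize_error({"msg": "run not found", "status_code": 404}, 404))
    print(normalize_error({"code": 401, "message": "unauthorized"}, 401))
```

A shared error schema across the Cloud APIs would make this translation layer unnecessary.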

with Data Security / FASP

with TASC / FASP

17 of 32

Questions /

Discussion

Thank you!

Work with us: GitHub org, Slack

18 of 32

TOPMed (BioData Catalyst)

How are the Cloud APIs being used for your project?

https://biodatacatalyst.nhlbi.nih.gov

  • Auth
  • Data Access
  • Metadata & Search
  • Compute Workflows


19 of 32

TOPMed (BioData Catalyst)

How are the Cloud APIs being used for your project?

  • Auth
    • GA4GH Passports via NIH RAS
  • Data Access
    • GA4GH DRS 1.1 on Gen3 with Clients on Terra and SBG
  • Metadata & Search
    • PIC-SURE, Gen3 GraphQL, FHIR, etc… not GA4GH Search currently
  • Compute Workflows
    • GA4GH TRS on SBG and Terra
    • WDL & CWL workflows, notebooks as well
    • GA4GH WES on SBG
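
For the DRS clients mentioned above, the first step is turning a DRS URI into a fetchable URL. The sketch below follows the hostname-based convention from the DRS specification (`drs://<hostname>/<id>` maps to `https://<hostname>/ga4gh/drs/v1/objects/<id>`). It deliberately rejects compact-identifier URIs, which need a resolver lookup that is out of scope here; the example hostname is a placeholder.

```python
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri):
    """Resolve a hostname-based DRS URI to its HTTPS object endpoint."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs":
        raise ValueError(f"not a DRS URI: {drs_uri}")
    object_id = parsed.path.lstrip("/")
    if not parsed.netloc or not object_id:
        # Compact-identifier URIs (drs://prefix:accession) end up here.
        raise ValueError(f"expected drs://<hostname>/<id>: {drs_uri}")
    encoded_id = quote(object_id, safe="")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{encoded_id}"


if __name__ == "__main__":
    print(drs_uri_to_url("drs://drs.example.org/314159"))
```

The resulting URL returns the object's metadata and access methods; for controlled data the request would additionally need a token, which is where the Passport interaction comes in.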


20 of 32

TOPMed (BioData Catalyst)

What improvements do you need to see in order to adopt?

    • How do Passports and DRS interact?
    • How do Passports and WES interact?
    • DRS should continue to evolve to accommodate, for example, imaging data (which requires bundles with paging), extended metadata, etc.
    • TRS to support "apps" in addition to workflows


21 of 32

Other Driver Projects and Implementers?


22 of 32

WES: State of WES Implementation


23 of 32

WES

  1. Data Access Credentials (touches on DRS x WES integration)
    1. https://github.com/ga4gh/workflow-execution-service-schemas/issues/18. This conversation has led to a doc for a proposal.
  2. Managing WES outputs
  3. Structured usage of "workflow_params" and "workflow_attachments"


24 of 32

TRS


25 of 32

TRS

1) what are the major 1-3 issues you want to address between now and Plenary and

2) which will be addressed in Connect (breakouts either official or ad hoc).

  • representation of new information in TRS driven by Dockstore's use of TRS such as signing workflows/containers, services
  • potentially tackling some of the data security ideas that came up in the workflow security discussion
  • housekeeping in the sense of addressing feedback from implementers and users of the standard but also keeping up with openapi3, pagination, etc.


26 of 32

TES


27 of 32

TES

1) What are the major 1-3 issues you want to address between now and Plenary?

  • Roadmap
    1. Introduce status change callbacks (alternative to polling for task completion) #121
    2. Contribute to cloud toolkit and documentation as part of "deployment kit" activities
    3. Flagging/filtering stored tasks #104
    4. File Storage Access Credential; Token Handling - 2021 optimistic; need to align with DRS, WES
  • Other priority tickets
    • Backend-specific resource definitions #127
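
For context on the callback item in the roadmap, this is roughly the polling loop a client has to run in the absence of callbacks. It is a minimal sketch: `get_state` is an assumed callable (for example, a wrapper around `GET /tasks/{id}?view=MINIMAL`) that returns the task's state string, and the terminal states listed are those of TES 1.0.

```python
import time

# Terminal task states as defined in TES 1.0.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wait_for_task(get_state, task_id, initial_delay=1.0, max_delay=60.0,
                  timeout=3600.0, sleep=time.sleep):
    """Poll with exponential backoff until the task reaches a terminal state.

    Returns the terminal state, or raises TimeoutError once the accumulated
    sleep time reaches `timeout` without the task finishing.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        state = get_state(task_id)
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {state} after {waited}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

Every waiting client generates this traffic for every task, which is the load a status change callback would remove.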


28 of 32

DRS


29 of 32

DRS

1) What are the major 1-3 issues you want to address between now and Plenary?

  • Highest priority tickets
    1. Passports + DRS (#339)
    2. Scaling (#342) - Brings together multiple tickets including:
      1. DRS paging #325
      2. DRS bulk requests #334
    3. Discovery + DRS (#228)
  • Other priority tickets
    • Metadata + DRS (#336)
    • Client efforts, libraries, OpenAPI work, registry, etc. (#267, #230)
  • See our DRS project board


30 of 32

DRS

2) Which issues will be addressed in Connect (breakouts either official or ad hoc)?

DRS + Passports - March 3rd @ 13:30 UTC (with DURI WS)

DRS Alignment with Beacon and Search - March 3rd @ 22:30 UTC (with Discovery WS)


31 of 32

Goals for the week


32 of 32

Cloud Goals for the Week

Connect Goals:

  1. Prioritize API spec enhancements for 2021 based on Driver Project need
  2. Discuss our high-profile issues and translate into open pull requests

Cloud WS - March 1st @ 21:00 UTC

  • Driver projects weigh in on what new features are needed for our API specifications

Key Management in the Cloud - March 2nd @ 21:00 UTC (with Large Scale Genomics WS)

  • Secure management of encryption keys to encrypt/decrypt Crypt4GH files stored in the cloud

DRS + Passports - March 3rd @ 13:30 UTC (with DURI WS)

  • Formalize the token handoff process between researcher, passport broker, and DRS service for standardized, secure access to controlled DRS datasets

DRS Alignment with Beacon and Search - March 3rd @ 22:30 UTC (with Discovery WS)

  • Drive collaboration with other standards teams to coordinate on terminology and tools
  • Harmonize metadata models for standardized searching of DRS objects through Beacon V2 and Search
