1 of 28

Workflows, Planemo, BioBlend and tags to automate SARS-CoV-2 genome surveillance

Wolfgang Maier

Galaxy Europe Team

University of Freiburg, Germany

2022-09-29

2 of 28

The Problem

3 of 28

tiled amplicons

amplified viral cDNA

sequencing reads of amplicons

SARS-CoV-2 lineage stats

data processing steps

5 of 28

data processing steps

mapping

variant calling (mutations)

primer trimming

consensus building

lineage assignment (with pangolin or nextclade)
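As a minimal sketch of how the last step feeds the lineage stats: pangolin writes a CSV report with, among others, a `taxon` and a `lineage` column, and counting the `lineage` values gives a per-lineage tally. The sample rows below are invented for illustration, and the helper name `lineage_counts` is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented excerpt of a pangolin lineage report (real reports have more columns).
SAMPLE_REPORT = """taxon,lineage
sample_01,BA.5.2
sample_02,BA.5.2
sample_03,BA.2
sample_04,BE.1.1
"""


def lineage_counts(report_text):
    """Count how many samples were assigned to each lineage."""
    reader = csv.DictReader(io.StringIO(report_text))
    return Counter(row["lineage"] for row in reader)


counts = lineage_counts(SAMPLE_REPORT)
# Print lineages from most to least frequent as tab-separated lines.
for lineage, n in counts.most_common():
    print(f"{lineage}\t{n}")
```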

6 of 28

tiled amplicons

amplified viral cDNA

sequencing reads of amplicons

SARS-CoV-2 lineage stats

information content

data processing steps

!

risk of fragmentation /

poor comparability

7 of 28

The Solution

8 of 28

Illumina WGS

Illumina ARTIC

ONT ARTIC

mutations in standard VCF format with rich call statistics and annotations

Reporting

Consensus building
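To illustrate what "standard VCF format with rich call statistics" offers downstream, here is a small sketch that splits one VCF data line into its fixed columns and parses the INFO field. The example record is made up, and the INFO keys shown (DP, AF, SB, DP4) are assumed to be the ones of interest; the function name `parse_vcf_record` is hypothetical.

```python
def parse_vcf_record(line):
    """Parse the eight fixed columns of a single VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, filt, info = fields[:8]
    info_dict = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # INFO entries without "=" are flags; store them as True.
        info_dict[key] = value if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filter": filt,
        "info": info_dict,
    }


# Invented example line; the statistics are not taken from a real sample.
EXAMPLE = "NC_045512.2\t23403\t.\tA\tG\t49314\tPASS\tDP=1500;AF=0.998;SB=0;DP4=1,1,748,750"

rec = parse_vcf_record(EXAMPLE)
af = float(rec["info"]["AF"])
print(rec["pos"], rec["ref"], rec["alt"], af)
```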

9 of 28

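covid19.galaxyproject.org variation analysis workflows

Illumina WGS: mapping with bwa-mem, variant calling with lofreq, variant annotation with snpEff (covid-19 release)

Illumina ARTIC: mapping with bwa-mem, primer trimming with ivar, variant calling with lofreq, variant annotation with snpEff (covid-19 release)

ONT ARTIC: mapping with minimap2, variant calling with medaka, variant annotation with snpEff (covid-19 release)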

11 of 28

version-controlled workflows with defined releases

16 of 28

A new problem: lots of workflows to run

17 of 28

Solution:

Automation through the API

18 of 28

https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/workflow-automation/tutorial.html

20 of 28

Use BioBlend to fill the template and create the final job.yml file for planemo run
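A rough sketch of that idea, under several assumptions: the input labels in the template ("Paired Collection", "Reference FASTA") are hypothetical, the job file is assumed to reference existing Galaxy items through `galaxy_id`, and `gi` is assumed to be a BioBlend `GalaxyInstance` (created elsewhere with `from bioblend.galaxy import GalaxyInstance`). Only `gi.histories.show_history(history_id, contents=True)` is used to look up IDs; the helper names are made up for illustration.

```python
from string import Template

# Hypothetical job template; the input labels must match the workflow's own.
JOB_TEMPLATE = Template("""\
Paired Collection:
  class: Collection
  galaxy_id: "$collection_id"
Reference FASTA:
  class: File
  galaxy_id: "$reference_id"
""")


def find_content_id(gi, history_id, name):
    """Return the ID of the first non-deleted history item with this name."""
    for item in gi.histories.show_history(history_id, contents=True):
        if item["name"] == name and not item.get("deleted", False):
            return item["id"]
    raise LookupError(f"no item named {name!r} in history {history_id}")


def build_job_yaml(collection_id, reference_id):
    """Fill the template with the IDs that were looked up via the API."""
    return JOB_TEMPLATE.substitute(
        collection_id=collection_id, reference_id=reference_id
    )


# Placeholder IDs; in practice they would come from find_content_id().
job_yaml = build_job_yaml("COLLECTION_ID", "REFERENCE_ID")
print(job_yaml)
```

The resulting text would be written to job.yml and passed to planemo run together with the workflow.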

21 of 28

record states, identify histories
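One way to picture this use of tags is the sketch below: each history carries tags that record which processing stages it has passed, and a script selects the histories that are ready for the next stage. The tag names and the helper `histories_awaiting` are hypothetical; the list of history dictionaries is assumed to resemble what BioBlend's `gi.histories.get_histories()` returns, and a new tag could then be attached with `gi.histories.create_history_tag(history_id, tag)`.

```python
def histories_awaiting(histories, done_tag, next_tag):
    """Select histories that carry done_tag but not yet next_tag."""
    selected = []
    for history in histories:
        tags = history.get("tags", [])
        if done_tag in tags and next_tag not in tags:
            selected.append(history)
    return selected


# Invented example data standing in for an API response.
EXAMPLE_HISTORIES = [
    {"id": "h1", "name": "batch 1", "tags": ["variation-done", "report-done"]},
    {"id": "h2", "name": "batch 2", "tags": ["variation-done"]},
    {"id": "h3", "name": "batch 3", "tags": []},
]

ready = histories_awaiting(EXAMPLE_HISTORIES, "variation-done", "report-done")
print([h["id"] for h in ready])
```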

22 of 28

Workflow run automation via the Galaxy API
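As a sketch of the final step, the snippet below assembles a planemo run command line for an external Galaxy server. The options used (`--engine external_galaxy`, `--galaxy_url`, `--galaxy_user_key`, `--history_name`) are given as I understand them from the workflow automation tutorial and should be checked against `planemo run --help`; the workflow ID, server URL, environment variable name and history name are placeholders.

```python
import os
import shlex


def planemo_run_command(workflow_id, job_file, galaxy_url, api_key, history_name):
    """Build the argument list for running a workflow on an external Galaxy."""
    return [
        "planemo", "run", workflow_id, job_file,
        "--engine", "external_galaxy",
        "--galaxy_url", galaxy_url,
        "--galaxy_user_key", api_key,
        "--history_name", history_name,
    ]


cmd = planemo_run_command(
    workflow_id="WORKFLOW_ID",          # placeholder
    job_file="job.yml",
    galaxy_url="https://usegalaxy.eu",
    api_key=os.environ.get("GALAXY_API_KEY", "API_KEY"),  # placeholder variable
    history_name="variation analysis batch",
)
# Only print the command here; subprocess.run(cmd, check=True) would execute it.
print(shlex.join(cmd))
```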

23 of 28

VCF

Reports

Consensus FASTA

Downstream variant analysis/providers

Direct data exploration through tabular datasets and plots

nextstrain, pangolin

GISAID

Genome surveillance initiatives

24 of 28

A public archive of usegalaxy.* Covid-19 analysis efforts

ftp://xfer13.crg.eu/

25 of 28

Full access to ~2,000 analysis batches on ~400,000 SARS-CoV-2 samples

26 of 28

What’s next?

SARS-CoV-X

Influenza

West Nile

MERS

Nipah

Monkeypox

Ebola

Lassa

?

  • maintain existing workflows
  • adapt workflows to other pathogens of concern
  • maintain infrastructure, frameworks and services

(invest in these things if you don’t have them yet)

27 of 28

Anton Nekrutenko

Björn Grüning

Simon Bray

Nathan Roach

Marius van den Beek

Dannon Baker

Sergei Pond

Ulvi Talas

Peter van Heusden

Babita Singh

Mauricio Moldes

28 of 28