Travelling with warp-speed
-- an update on Galaxy Workflows
Marius van den Beek, John Chilton, Nicola Soranzo and the Galaxy Team
Slides @ bit.ly/gxworkflows2018
Workflows
Linear progression of analysis steps
Store Tool Parameters
Communicate intent
Enable cooperation
Workflows
Create from history
Create from Scratch in Workflow Editor
Input + Parameters + Workflow = Output
A history of Workflows in Galaxy
Workflow Editor ~ 11 yo
Extract from history ~ 11 yo
Workflow in Tool panel ~ 11 yo
Input modules ~ 11 yo
Post Job Actions ~ 9 yo
Collections in Workflows ~ 5 yo
Modularize workflows w/ subworkflows
https://usegalaxy.org/u/marius/w/parent-workflow-chipseq
Subworkflows vs workflow spaghetti
17.09 - Re-run and Replace
18.01 - Workflow Post Job Actions
The standard set of dataset post job actions (hide, tag, delete, rename) all work very intuitively with collections... finally.
18.01 - Scaling Job Cache
Find and re-use the output of jobs with identical combinations of input files and parameters, which, assuming deterministic output, should produce the same result.
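The idea can be illustrated with a minimal sketch. This is not Galaxy's implementation; the names `job_cache_key` and `JobCache` are hypothetical, and the sketch assumes inputs are identified by a content hash and that tools are deterministic.

```python
import hashlib
import json


def job_cache_key(tool_id, tool_version, parameters, input_hashes):
    """Build a stable key from everything that determines a job's output.

    Hypothetical helper: assumes each input is identified by a content hash.
    """
    payload = {
        "tool_id": tool_id,
        "tool_version": tool_version,
        "parameters": parameters,
        "inputs": input_hashes,
    }
    # sort_keys makes the key independent of dictionary ordering
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class JobCache:
    """In-memory illustration: run a job only if no equivalent job ran before."""

    def __init__(self):
        self._outputs = {}
        self.executions = 0

    def run(self, tool_id, tool_version, parameters, input_hashes, execute):
        key = job_cache_key(tool_id, tool_version, parameters, input_hashes)
        if key not in self._outputs:
            # Cache miss: actually execute the job and remember its output
            self.executions += 1
            self._outputs[key] = execute()
        return self._outputs[key]
```

Two workflows that begin with the same tool, the same parameters and the same input files would map to the same key in this sketch, so the second workflow would pick up the stored output instead of recomputing it.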
Enables running overlapping workflows without overhead
[Figure: Quality Control Workflow and Analysis Workflow, steps 1-3 each]
18.01/18.05 - Clone tools with settings
18.09 - Switch tool versions in editor
18.09 - Zoom in workflow editor
18.09 - Runtime parameters for Subworkflows
Future plans
Enable more complex analyses, such as branch points and computed parameters
Visual feedback about Workflow Progress
Export Workflows, Parameters, Inputs, Runtime details, etc. into Research Objects (RO)
Workflow Tooling - CWL
Most (79 / 133) tests pass in a fork.
Many workflow and collections enhancements came from that branch, many more to come.
Workflow Tooling - Format 2
Workflow Tooling - Testing
http://bit.ly/gxwftests
Planemo For Workflows
$ planemo serve <workflow.(ga|yml)>
pre-18.01
$ planemo test <workflow.(ga|yml)>
18.05+
$ planemo convert <workflow.(ga|yml)>
19.01
$ planemo workflow_edit <workflow.(ga|yml)>
19.01
Rules for Building!
Create arbitrarily nested collections in the API.
Great for organizing data from instrument sample sheets, spreadsheets with data source links, or FTP directories into collections.
Rules for Manipulating!
Embed rule editor right into the tool form.
Load collection elements as rows, metadata as columns.
Filter, re-group, sort existing collections interactively or as part of workflows.
Structure and restructure data as needed for different parts of an analysis.
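The rows-and-columns model above can be sketched in plain Python. This is only an illustration of the idea, not the rule editor's actual rule format or API; all function names and the example file names are hypothetical.

```python
import re


def elements_to_rows(identifiers):
    """Load collection elements as rows: one row per element, one column so far."""
    return [[identifier] for identifier in identifiers]


def add_regex_columns(rows, source_column, pattern):
    """Add one column per regex group; rows that do not match are filtered out."""
    result = []
    for row in rows:
        match = re.match(pattern, row[source_column])
        if match:
            result.append(row + list(match.groups()))
    return result


def sort_rows(rows, column):
    """Sort rows by the value in one column (stable sort)."""
    return sorted(rows, key=lambda row: row[column])


def group_rows(rows, outer_column, inner_column, value_column):
    """Re-group flat rows into a nested structure: outer -> inner -> value."""
    nested = {}
    for row in rows:
        nested.setdefault(row[outer_column], {})[row[inner_column]] = row[value_column]
    return nested


# Hypothetical flat list of element identifiers
identifiers = ["s2_R1.fastq", "s1_R2.fastq", "s1_R1.fastq", "readme.txt"]
rows = elements_to_rows(identifiers)
rows = add_regex_columns(rows, 0, r"(\w+?)_(R[12])\.fastq$")  # sample, read
rows = sort_rows(rows, 1)
nested = group_rows(rows, outer_column=1, inner_column=2, value_column=0)
```

Here `nested` groups the three FASTQ files by sample and then by read, while `readme.txt` is dropped because it does not match the pattern.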
Rules - Accessible from the Start
Huge thanks to Helena Rasche for the detailed PR review and edits!
Rules for Workflows - The Problem
Saskia Hiltemann et al. at Erasmus MC are working with an outside diagnostics lab (Streeklab Haarlem); pipelines must be easy and foolproof.
Latest project - paired-end sequencing on two loci (that each encode one part of the HLA-DQ protein complex, HLADQ-A & HLADQ-B) for each patient.
Originally made a workflow with 4 separate inputs (gene A forward and reverse, gene B forward and reverse). Suboptimal because:
Rules for Workflows - The Solution
Group tags for complex analysis
Selecting datasets from a nested collection allows multi-factor designs:
~ batch + condition
https://github.com/galaxyproject/tools-iuc/pull/2167
https://github.com/galaxyproject/galaxy/pull/5457
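As a rough sketch of how group tags could feed a multi-factor design such as `~ batch + condition`: the code below assumes tags of the form `group:<factor>:<value>`, and the helper names and sample data are hypothetical. It does not reproduce the code in the pull requests above.

```python
def parse_group_tags(tags):
    """Extract factor -> value pairs from tags shaped like 'group:<factor>:<value>'.

    The tag format is an assumption made for this sketch.
    """
    factors = {}
    for tag in tags:
        parts = tag.split(":", 2)
        if len(parts) == 3 and parts[0] == "group":
            factors[parts[1]] = parts[2]
    return factors


def design_table(datasets, factor_names):
    """Build one row per dataset: name followed by the level of each factor."""
    table = []
    for name, tags in datasets.items():
        factors = parse_group_tags(tags)
        # Missing factors show up as None so gaps are visible
        table.append([name] + [factors.get(factor) for factor in factor_names])
    return table


# Hypothetical tagged datasets
datasets = {
    "sample1": ["group:batch:1", "group:condition:treated"],
    "sample2": ["group:batch:1", "group:condition:control"],
    "sample3": ["group:batch:2", "group:condition:treated", "name:rerun"],
}
table = design_table(datasets, ["batch", "condition"])
```

Each row of `table` holds a dataset name plus its batch and condition levels, which is the information a multi-factor design needs.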
Re-usable workflow parameters
Thanks!
Thanks to the whole Galaxy community for building awesome stuff with workflows.