Tour of a WDL workflow

Stephanie Gogarten

Simulation and Benchmarking sub-WG

Sept 21, 2022

Outline

Where to learn WDL

Example workflow: submit job to imputation server

Docker

Dockstore

Advanced WDL: parallel jobs with scatter()

Where to learn WDL

“Learn WDL” videos

Anatomy of a workflow

  1. WDL version
  2. Workflow name
  3. Workflow inputs
  4. Call to task
  5. Task inputs
  6. Workflow outputs
  7. Author information
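
The seven parts listed above can be sketched in one small file. This is a minimal illustration, not the imputation server workflow from the talk: the workflow name, task, input names, and author details are all hypothetical placeholders.

```wdl
version 1.0                               # 1. WDL version

workflow hello_workflow {                 # 2. Workflow name (hypothetical)
    input {                               # 3. Workflow inputs
        String greeting
        Array[String] names
    }

    call say_hello {                      # 4. Call to task
        input:                            # 5. Task inputs
            greeting = greeting,
            names = names
    }

    output {                              # 6. Workflow outputs
        String message = say_hello.message
    }

    meta {                                # 7. Author information (placeholders)
        author: "Your Name"
        email: "you@example.org"
    }
}

# The task that the workflow calls; tasks can be defined in the same file
task say_hello {
    input {
        String greeting
        Array[String] names
    }

    command {
        echo "${greeting} ${sep=', ' names}" > message.txt
    }

    output {
        String message = read_string("message.txt")
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}
```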

Workflow inputs

Calling a task

Workflow inputs map to task inputs

This example has only one task; in a workflow with multiple tasks, each task can take a different subset of the workflow inputs
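
A sketch of how workflow inputs map to task inputs when there is more than one task. The task and input names here are assumptions made up for illustration; each call receives only the inputs it needs.

```wdl
version 1.0

workflow two_tasks {
    input {
        File vcf_file
        String ref_panel
        String population
    }

    # The first task needs only the file
    call check_file {
        input:
            infile = vcf_file
    }

    # The second task uses the other two workflow inputs
    call describe_job {
        input:
            panel = ref_panel,
            pop = population
    }

    output {
        Int n_lines = check_file.n_lines
        String description = describe_job.description
    }
}

task check_file {
    input {
        File infile
    }

    command {
        wc -l < ${infile}
    }

    output {
        Int n_lines = read_int(stdout())
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}

task describe_job {
    input {
        String panel
        String pop
    }

    command {
        echo "panel=${panel} population=${pop}"
    }

    output {
        String description = read_string(stdout())
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}
```

The name on the left of each `=` is the task's input; the name on the right is the workflow-level value passed to it.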

Inside the task

  1. Task inputs
  2. Command block
  3. Task outputs
  4. Runtime (instructions for compute environment)
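
A minimal task with the four parts labelled. The task name and command are hypothetical, and the command assumes an uncompressed VCF file.

```wdl
version 1.0

task count_variants {
    # 1. Task inputs
    input {
        File vcf_file
    }

    # 2. Command block: count the lines that are not header lines
    command {
        grep -v '^#' ${vcf_file} | wc -l
    }

    # 3. Task outputs
    output {
        Int n_variants = read_int(stdout())
    }

    # 4. Runtime (instructions for compute environment)
    runtime {
        docker: "ubuntu:22.04"
    }
}
```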

Task inputs

Doing stuff: the command block

Shell commands to be executed at runtime

Input references are enclosed in ${} (WDL 1.0 and later also accept ~{})

Transform inputs with options inside the braces (e.g. sep joins the elements of an array with a separator)
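
A small sketch of both kinds of reference in a command block. The task and input names are hypothetical.

```wdl
version 1.0

task join_values {
    input {
        String prefix
        Array[String] chromosomes
    }

    # ${prefix} is replaced by the input value before the shell runs.
    # The sep option joins the array elements with commas.
    command {
        echo "${prefix}: ${sep=',' chromosomes}"
    }

    output {
        String joined = read_string(stdout())
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}
```

With prefix "chr" and chromosomes ["1", "2", "3"], the shell receives the command `echo "chr: 1,2,3"`.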

Task output

Output types are the same as input types, e.g. String, Boolean, File

This example reads a temporary file to return a string
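
A sketch of the same pattern with all three output types. The echo command is a stand-in for a tool that writes a job ID to a file; it is not the command from the talk, and the names are hypothetical.

```wdl
version 1.0

task submit_job {
    input {
        String job_name
    }

    command {
        echo "job-${job_name}" > job_id.txt
    }

    output {
        # Contents of the temporary file, returned as a String
        String job_id = read_string("job_id.txt")
        # The file itself, copied out of the task's working directory
        File job_id_file = "job_id.txt"
        # True if the file has at least one line
        Boolean has_id = length(read_lines("job_id.txt")) > 0
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}
```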

Workflow output

Output of workflow references task output

Displayed in AnVIL job manager:

Task name

Defining the compute environment

The “runtime” block specifies the Docker image in which the command is executed

The Docker image must contain all software needed to run the command
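
A sketch of a runtime block. The image name is a placeholder, not a real image, and the task assumes that image has curl installed. The memory and cpu lines are optional attributes understood by Cromwell, the engine that runs WDL on AnVIL; they are included here only as an illustration.

```wdl
version 1.0

task check_curl {
    command {
        curl --version > version.txt
    }

    output {
        File version_file = "version.txt"
    }

    runtime {
        # Placeholder image name; replace with an image that contains curl
        docker: "your-dockerhub-user/your-image:0.1.0"
        # Optional resource requests (Cromwell runtime attributes)
        memory: "2 GB"
        cpu: 1
    }
}
```

If the image lacks a program that the command calls, the task fails at runtime with a "command not found" error from the shell.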

What is Docker?

  • Custom compute environment that can be run anywhere
  • A Docker image is:
    • built containing all software necessary to run an application
      • Usually built from a base image (e.g., ubuntu)
      • Typically composed of multiple layers
    • a read-only template with instructions for creating a Docker container
  • A Docker container is:
    • a runnable instance of an image on a local or host computer
    • what the image becomes in memory when executed

Creating a Docker image

  1. Base image
  2. Install curl and Java
  3. Download imputationbot software

Save in a “Dockerfile”

Build image locally

Push to repository

Maintaining code on GitHub

Dockerfile

WDL file

Sync with Dockstore

Example inputs

Publishing the workflow on Dockstore

.dockstore.yml file

  1. Name of workflow
  2. Path to WDL
  3. Path to example JSON

Follow instructions linked from PRIMED website to connect your GitHub repository to Dockstore

PRIMED WDL collections

Workflow page on Dockstore

Workflow must be published to see this page

Import workflow to an AnVIL workspace

Workflow on AnVIL

Edit example JSON and upload to AnVIL

Example JSON

Advanced WDL: scatter()

Example of a workflow with two levels of nested parallelization: over chromosomes and over populations

The output of a scattered task is an array
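
A minimal sketch of two nested scatter blocks, over chromosomes and over populations. The workflow, task, and input names are hypothetical; the echo command stands in for the real analysis step.

```wdl
version 1.0

workflow nested_scatter {
    input {
        Array[String] chromosomes
        Array[String] populations
    }

    # Outer level: one branch per chromosome
    scatter (chrom in chromosomes) {
        # Inner level: one job per population within each chromosome
        scatter (pop in populations) {
            call make_label {
                input:
                    chrom = chrom,
                    pop = pop
            }
        }
    }

    output {
        # The task returns a String; each scatter level wraps it in an Array,
        # so two nested levels give Array[Array[String]]
        Array[Array[String]] labels = make_label.label
    }
}

task make_label {
    input {
        String chrom
        String pop
    }

    command {
        echo "chr${chrom}_${pop}"
    }

    output {
        String label = read_string(stdout())
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}
```

With 22 chromosomes and 3 populations this launches 66 independent jobs, which the execution engine can run in parallel.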

More info here

Questions?

#anvil channel on PRIMED Slack