1 of 24

Software virtualization:

Part 2: BIDS Apps and BIDS Apps on the Science Cloud

March 2, 2017

Franz Liem

2 of 24

This talk is going to introduce BIDS Apps (Brain Imaging Data Structure Apps), a convenient way of distributing neuroimaging software in software containers.

BIDS Apps rely on the recently proposed Brain Imaging Data Structure, a way of organizing files from brain imaging studies.

We will demonstrate how to run BIDS Apps on your laptop, as well as on UZH’s Science Cloud.

3 of 24

Have you ever...

… tried to re-run your analyses on a new computer that didn’t have all the right software installed? And failed miserably?

… tried to re-run analyses performed by a former lab member who now works someplace else? And failed miserably?

… tried to install and run some lab’s cool but tricky-to-install software? And failed miserably?

4 of 24

Reproducibility

  • Science that is not replicable is a waste of everyone's time and money

  • I should be able to re-run my one-year-old analysis on the same data and be able to produce the same results
  • I should be able to run my analysis on new data
  • Other scientists should be able to run my analysis on their data (and vice versa).

5 of 24

Reproducibility

  • Having code available to run a previous analysis and being able to run a previous analysis on my own data are not the same thing.
  • Software installation can be painful
  • Different software versions can yield different results (Gronenschild et al., 2012)

6 of 24

BIDS Apps

PLOS Computational Biology

Preprint: http://biorxiv.org/content/early/2017/01/29/079145

7 of 24

BIDS Apps

  • Portable neuroimaging pipelines
  • Shipped as Docker software containers (more lightweight than a virtual machine)
    • For a Docker intro, see neurohackweek.github.io/docker-for-scientists
  • Developed by different labs all over the world
  • Spearheaded by Chris Gorgolewski (Stanford Center for Reproducible Neuroscience)

9 of 24

BIDS Apps

  • Over 20 Apps available or under lively development
  • http://bids-apps.neuroimaging.io/
  • Containers can be downloaded from hub.docker.com/r/bids/
  • BIDS Apps take BIDS datasets as input

10 of 24

Brain Imaging Data Structure (BIDS)

  • "A simple and intuitive way to organize and describe your neuroimaging and behavioral data."
  • → a way to name and organize files from neuroimaging studies
  • Metadata to describe the data more fully
  • Gorgolewski et al. (2016)
  • http://bids.neuroimaging.io/
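To make the naming scheme concrete, here is a minimal sketch that builds an empty BIDS-style skeleton for one participant. The participant label "01", the temporary location, and the choice of one T1w and one diffusion scan are assumptions made for illustration; the files are empty placeholders, not real images.

```shell
set -eu

bids_root="$(mktemp -d)/sourcedata"   # illustrative location, not from the talk
sub="sub-01"                          # hypothetical participant label

mkdir -p "$bids_root/$sub/anat" "$bids_root/$sub/dwi"

# Dataset-level files
printf '{"Name": "Example dataset", "BIDSVersion": "1.0.0"}\n' \
    > "$bids_root/dataset_description.json"
printf 'participant_id\n%s\n' "$sub" > "$bids_root/participants.tsv"

# Placeholder (empty) image and gradient files, named <participant>_<modality>
touch "$bids_root/$sub/anat/${sub}_T1w.nii.gz"
touch "$bids_root/$sub/dwi/${sub}_dwi.nii.gz" \
      "$bids_root/$sub/dwi/${sub}_dwi.bval" \
      "$bids_root/$sub/dwi/${sub}_dwi.bvec"

# List the resulting files
(cd "$bids_root" && find . -type f | sort)
```

Every file name starts with the participant label and ends with the modality, so an App can find what it needs from the folder layout alone.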

11 of 24

BIDS Apps - design principles

  • Portable (Plug-and-play)
    • Runs on any system that runs Docker
    • To process your data, you only need to specify
      • BIDS App
      • Input folder (BIDS formatted)
      • Output folder
      • App-specific (optional) inputs
  • White-Box
    • You can always take a look under the hood
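The four things you specify map directly onto one docker call. The sketch below only prints the command instead of running it, so it works without Docker installed. The function name bids_app_cmd is hypothetical; the mount points /bids_dataset and /outputs and the example paths follow the tracula example used later in this talk.

```shell
set -eu

# Print (not run) the docker command for a BIDS App call.
# Usage: bids_app_cmd APP INPUT_DIR OUTPUT_DIR LEVEL [app-specific args...]
bids_app_cmd() {
    app=$1; in_dir=$2; out_dir=$3; level=$4
    shift 4
    # Input is mounted read-only; remaining arguments are passed to the App.
    echo docker run -ti --rm \
        -v "$in_dir:/bids_dataset:ro" \
        -v "$out_dir:/outputs" \
        "$app" /bids_dataset /outputs "$level" "$@"
}

cmd=$(bids_app_cmd bids/tracula /data/sourcedata /data/derivates/tracula \
        participant --license_key XXXXXXXX)
echo "$cmd"
```

To actually run the App, remove the echo inside the function (this requires Docker and a valid license key).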

12 of 24

[Figure: BIDS App with input data and output data; from Gorgolewski et al. (2017)]

13 of 24

An example

Let’s assume you want to analyze your data with a nipype workflow provided by another lab.

The pipeline requires having installations of:

  • NIPYPE
  • FSL
  • FREESURFER
  • ANTS

14 of 24

Without a BIDS App:

You need to install:

  • Python
  • NIPYPE
  • FSL
  • FREESURFER
  • ANTS

You need to wrangle your data into an input format that the pipeline understands.

Using nipype you can easily run the pipeline on multiple cores on your computer, but what if you want to run it on the (Science) Cloud?

If the other lab provides a BIDS App:

You need to:

  • Install Docker
  • [Download the container: docker pull <app_name>]

Run the entire analysis with one command.

Once your data set is formatted according to the BIDS standard, you can run any BIDS App.

Using bidswrapps you can run your analysis in a distributed fashion on the Science Cloud.

15 of 24

Running BIDS Apps

github.com/bids-apps/tracula

surfer.nmr.mgh.harvard.edu/fswiki/Tracula

16 of 24

Running BIDS Apps

  1. Format the input data according to BIDS
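Before starting an App, a quick structural check can catch obvious mistakes. The following is only a rough heuristic sketch (the function name looks_like_bids is hypothetical); it checks for dataset_description.json and at least one sub-<label> folder and is no substitute for the BIDS validator.

```shell
set -eu

# Rough structural check; prints one line per problem found
# and returns non-zero if there is any problem.
looks_like_bids() {
    root=$1
    problems=0
    if [ ! -f "$root/dataset_description.json" ]; then
        echo "missing dataset_description.json"
        problems=$((problems + 1))
    fi
    found_sub=0
    for d in "$root"/sub-*; do
        if [ -d "$d" ]; then found_sub=1; fi
    done
    if [ "$found_sub" -eq 0 ]; then
        echo "no sub-<label> folders found"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# Demo on a temporary folder: first incomplete, then fixed.
demo=$(mktemp -d)
mkdir -p "$demo/sub-01/dwi"
if looks_like_bids "$demo"; then status_before=ok; else status_before=incomplete; fi
echo '{"Name": "Example", "BIDSVersion": "1.0.0"}' > "$demo/dataset_description.json"
if looks_like_bids "$demo"; then status_after=ok; else status_after=incomplete; fi
echo "before: $status_before, after: $status_after"
```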

17 of 24

Running BIDS Apps

2. Run participant level:

docker run -ti --rm \
    -v /data/sourcedata:/bids_dataset:ro \
    -v /data/derivates/tracula:/outputs \
    -v /data/derivates/fs:/fs \
    bids/tracula \
    /bids_dataset /outputs participant \
    --license_key "XXXXXXXX" \
    --freesurfer_dir /fs

18 of 24

Running BIDS Apps

The parts of the participant-level command:

  • docker run -ti --rm: tells Docker to run a container
  • -v <local folder>:<container folder>: mounts local folders in the Docker container
  • bids/tracula: name of the BIDS App
  • /bids_dataset /outputs participant: specifies input/output data and which level to run
  • --license_key, --freesurfer_dir: additional App-specific arguments

You don’t even have to download the Docker container first; it is pulled automatically the first time you run this command.

19 of 24

Running BIDS Apps

3. Run group level:

docker run -ti --rm \
    -v /data/sourcedata:/bids_dataset:ro \
    -v /data/derivates/tracula:/outputs \
    -v /data/derivates/fs:/fs \
    bids/tracula \
    /bids_dataset /outputs group1 \
    --license_key "XXXXXXXX" \
    --freesurfer_dir /fs

The command has the same parts as the participant-level call (docker run, folder mounts, name of the BIDS App, input/output data and level, App-specific arguments); only the analysis level changes.

20 of 24

Running BIDS Apps on the Science Cloud

UZH’s Science Cloud: www.s3it.uzh.ch/infrastructure/sciencecloud/

21 of 24

Running BIDS Apps on the Science Cloud

bidswrapps (github.com/fliem/bidswrapps)

22 of 24

Running BIDS Apps on the Science Cloud

bidswrapps

  • Running BIDS Apps on UZH’s Science Cloud
  • Thin wrapper around S3IT’s gc3pie (https://github.com/uzh/gc3pie)
  • Participant level: runs each participant as a separate job (in parallel)
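The idea of "one job per participant" can be illustrated locally. This sketch prints one participant-level docker command per sub-<label> folder; it is not how bidswrapps is implemented, only an illustration of the splitting. The function name is hypothetical, and it assumes the chosen App accepts the --participant_label argument from the BIDS Apps command-line convention.

```shell
set -eu

# Print one participant-level docker command per sub-<label> folder.
list_participant_jobs() {
    bids_dir=$1; out_dir=$2; app=$3
    for d in "$bids_dir"/sub-*; do
        if [ -d "$d" ]; then
            label=${d##*/sub-}    # ".../sub-01" becomes "01"
            echo docker run --rm \
                -v "$bids_dir:/bids_dataset:ro" \
                -v "$out_dir:/outputs" \
                "$app" /bids_dataset /outputs participant \
                --participant_label "$label"
        fi
    done
}

# Demo with two empty placeholder participants in a temporary folder.
demo=$(mktemp -d)
mkdir -p "$demo/sourcedata/sub-01" "$demo/sourcedata/sub-02" "$demo/derivates"
job_list=$(list_participant_jobs "$demo/sourcedata" "$demo/derivates" bids/tracula)
echo "$job_list"
```

Each printed line is independent of the others, which is what allows the jobs to run in parallel on separate machines.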

23 of 24

Running BIDS Apps on the Science Cloud

Participant level:

bidswrapps_start.py \
    bids/tracula \
    /project/sourcedata /project/derivates/tracula \
    participant \
    --volume /project/derivates/fs:/data/fs \
    -ra "--license_key XXXXX --freesurfer_dir /data/fs" \
    -s tracula.session -o /project/logfiles

24 of 24

Summary

  • BIDS Apps are Docker software containers with neuroimaging analysis pipelines
  • Once your data is BIDS-formatted, you can run all BIDS Apps
  • To analyse your data, you need to specify
    • BIDS App
    • Input folder
    • Output folder
  • BIDS Apps can be run on the Science Cloud via bidswrapps