1 of 10

Dynamic Job Mapping in Galaxy Simplified

Eric Enns

Public Health Agency of Canada

Presented at GCC2015

2 of 10

3 of 10

4 of 10

5 of 10

6 of 10

jobConfig.py

  • Provides configure_job function
  • Uses conditions to map jobs to destinations, adjust job params, or fail the job with an error message
    • filesize (lbound, hbound)
    • records (lbound, hbound)
  • Maps jobs to “dynamic_default” if no rule defined
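The rule matching described above can be sketched roughly as follows. This is an illustration only, not the actual jobConfig.py: the helper names `parse_size` and `choose_action`, the list-of-dicts representation of conditions, the 1024-based units, and the inclusive-lower/exclusive-upper bounds are all assumptions. The attribute names (`lbound`, `hbound`, `param`, `err_msg`) and the `dynamic_default` fallback come from the slides; reading `hbound="-1"` as "no upper limit" is inferred from the example rules.

```python
import re

# Assumed unit multipliers (binary, 1024-based); the slides do not state the convention.
UNITS = {"": 1, "B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a bound such as '10 MB', '0' or '-1' to a number of bytes."""
    match = re.fullmatch(r"\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None or match.group(2).upper() not in UNITS:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit.upper()])


def choose_action(filesize, conditions):
    """Return ('fail', message), ('params', spec) or ('default', 'dynamic_default')."""
    for cond in conditions:
        low = parse_size(cond["lbound"])
        high = parse_size(cond["hbound"])
        # A negative hbound is treated as "no upper limit" (assumed meaning of -1).
        if filesize >= low and (high < 0 or filesize < high):
            if cond["param"] == "fail":
                return ("fail", cond.get("err_msg", "job rejected"))
            return ("params", cond["param"])
    # No rule matched: fall back to the default destination.
    return ("default", "dynamic_default")


# Hypothetical in-memory copy of the spades rules shown later in the deck.
SPADES = [
    {"lbound": "0", "hbound": "10 MB", "param": "fail",
     "err_msg": "Too few reads for spades to work"},
    {"lbound": "4 GB", "hbound": "-1", "param": "fail",
     "err_msg": "Too much data, shouldn't run"},
    {"lbound": "10 MB", "hbound": "2 GB",
     "param": "-q test.q -pe galaxy 24 -l h_vmem=15G"},
    {"lbound": "2 GB", "hbound": "4 GB",
     "param": "-q test.q -pe galaxy 48 -l h_vmem=96G"},
]
```

With these assumptions, `choose_action(3 * 1024**3, SPADES)` selects the 48-slot parameters, while a tool with no rules at all falls through to `dynamic_default`.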

7 of 10

job_conf.xml

<destinations default="dynamic_job">
  <destination id="dynamic_job" runner="dynamic">
    <param id="type">python</param>
    <param id="function">configure_job</param>
  </destination>
  <destination id="dynamic_default" runner="drmaa"/>
</destinations>

8 of 10

job_options.xml

<jobs>
  <job name="spades">
    <condition type="filesize" lbound="0" hbound="10 MB"
               err_msg="Too few reads for spades to work" param="fail" />
    <condition type="filesize" lbound="4 GB" hbound="-1"
               err_msg="Too much data, shouldn't run" param="fail" />
    <condition type="filesize" lbound="10 MB" hbound="2 GB"
               param="-q test.q -pe galaxy 24 -l h_vmem=15G" />
    <condition type="filesize" lbound="2 GB" hbound="4 GB"
               param="-q test.q -pe galaxy 48 -l h_vmem=96G" />
  </job>
</jobs>
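A rules file of this shape can be read with Python's standard library. The sketch below is an assumption about how such a file might be loaded, not the code used in jobConfig.py: `load_conditions` is a hypothetical helper, and the embedded XML is a shortened copy of the spades rules for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of the rules file (straight quotes are required by XML).
JOB_OPTIONS = """
<jobs>
  <job name="spades">
    <condition type="filesize" lbound="0" hbound="10 MB"
               err_msg="Too few reads for spades to work" param="fail" />
    <condition type="filesize" lbound="10 MB" hbound="2 GB"
               param="-q test.q -pe galaxy 24 -l h_vmem=15G" />
  </job>
</jobs>
"""


def load_conditions(xml_text, job_name):
    """Return the attributes of each <condition> for one job as a list of dicts."""
    root = ET.fromstring(xml_text.strip())
    for job in root.iter("job"):
        if job.get("name") == job_name:
            return [dict(cond.attrib) for cond in job.findall("condition")]
    return []  # no rules defined for this job


rules = load_conditions(JOB_OPTIONS, "spades")
```

Returning an empty list for an unknown job name leaves the caller free to fall back to a default destination when no rule is defined.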

9 of 10

Future Work

  • Automatically re-run jobs that fail because of resource problems
    • e.g. not enough memory
  • Add more condition types
    • e.g. which command-line arguments are set

10 of 10

Acknowledgements

Daniel Bouchard

Philip Mabon