Codefest Report
July 20-21, 2017
www.open-bio.org/wiki/Codefest_2017
What is Codefest?
How does Codefest work?
Thank you
Themes from the Codefest
Table column ordering
Specify where columns should be placed in modules, config and report
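A minimal sketch of how placement-based column ordering can work, assuming each column carries a numeric placement weight, lower weights sort further left, and a user config can override the weight set by a module. The function name `order_columns`, the default weight of 1000, and the sample column names are illustrative assumptions, not MultiQC's actual implementation.

```python
DEFAULT_PLACEMENT = 1000  # assumed fallback weight for columns without an explicit placement


def order_columns(headers, overrides=None):
    """Return column names sorted by placement weight (lower = further left).

    headers:   dict of column name -> dict that may contain a "placement" key
    overrides: optional dict of column name -> placement, e.g. from a user config
    """
    overrides = overrides or {}

    def weight(name):
        # A user override wins over the value set by the module.
        if name in overrides:
            return overrides[name]
        return headers[name].get("placement", DEFAULT_PLACEMENT)

    # sorted() is stable, so columns with equal weight keep their original order.
    return sorted(headers, key=weight)


headers = {
    "percent_gc": {"placement": 1200},
    "total_sequences": {},
    "percent_duplicates": {"placement": 900},
}
module_order = order_columns(headers)
config_order = order_columns(headers, overrides={"percent_gc": 100})
```

Here `module_order` uses only the weights set by the module, while `config_order` shows a hypothetical user override moving `percent_gc` to the front.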
Phil Ewels, Rickard Hammarén, Robin Andeer, Tim Booth, Dennis Schwartz, Dimitri Desvillechabrol, Amandine Perrin, Rowland Mosbergen, Murray Wham, Markus Ankenbrand, Raony Guimaraes, Tom Walsh
Pull requests
Issues
Scout integration
MultiQC reports embedded within Scout clinical genomics browser
Datatable by Musavvir Ahmed, group by Mello, help by i cons, Branch by Stanislav Levin from the Noun Project
New modules!
VCFtools, nonpareil, bcl2fastq(!), AfterQC
Module grouping
Just run modules related to a specific data type with new module tags.
Module help texts
New drop-down texts above plots in reports to describe what’s being shown
Collect software versions
Core MultiQC support for scraping software versions from logs
18 opened
9 opened
14 merged
5 closed
Snakemake, and "other Python things"
Members: Wibowo Arindrarto, Kai Blin, Spencer Bliven, Christian Brueffer, Peter Cock, Thomas Cokelaer, Joe Greener,
Sequanix GUI in PyQt for Snakemake pipelines; http://sequana.readthedocs.io
Protein structure
Members: Spencer Bliven, Alexander Rose, David Sehnal, Joe Greener
{ kind:"isCloseTo",
args: [
{ kind: "chains",
options: ["A"] },
{ kind: "residues",
options: ["HEM"] }
],
options: { maxDistance: 5 }
}
https://github.com/MolQL/molql
Jmol: select within(5, [HEM]:A)
isCloseTo(
chains('A'),
residues('HEM'),
5)
PyMOL: select chain A within 5 of resn HEM
User query
Abstract Syntax Tree
MolQL Interchange Format
Query Result
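To make the isCloseTo query concrete, here is a small sketch that evaluates the same tree against a toy atom list. It assumes that `isCloseTo` keeps atoms from its first argument that lie within `maxDistance` of any atom selected by its second argument. The `evaluate` function, the atom tuples, and the coordinates are invented for illustration and are not the MolQL API.

```python
import math

# Toy atoms: (chain, residue name, x, y, z). Coordinates are made up for illustration.
ATOMS = [
    ("A", "HEM", 0.0, 0.0, 0.0),
    ("A", "HIS", 3.0, 0.0, 0.0),
    ("A", "GLY", 9.0, 0.0, 0.0),
    ("B", "HIS", 1.0, 0.0, 0.0),
]

# The query tree from the slide, written as a Python dict.
QUERY = {
    "kind": "isCloseTo",
    "args": [
        {"kind": "chains", "options": ["A"]},
        {"kind": "residues", "options": ["HEM"]},
    ],
    "options": {"maxDistance": 5},
}


def evaluate(node, atoms):
    """Return the set of atom indices selected by a query tree node."""
    kind = node["kind"]
    if kind == "chains":
        return {i for i, atom in enumerate(atoms) if atom[0] in node["options"]}
    if kind == "residues":
        return {i for i, atom in enumerate(atoms) if atom[1] in node["options"]}
    if kind == "isCloseTo":
        # Assumed semantics: keep candidates near at least one target atom.
        candidates = evaluate(node["args"][0], atoms)
        targets = evaluate(node["args"][1], atoms)
        limit = node["options"]["maxDistance"]
        return {
            i for i in candidates
            if any(math.dist(atoms[i][2:], atoms[j][2:]) <= limit for j in targets)
        }
    raise ValueError(f"unsupported node kind: {kind}")


selected = sorted(evaluate(QUERY, ATOMS))
```

In this toy data the HEM atom and the nearby chain A histidine are selected; the chain B histidine is closer to the heme but is excluded by the chain filter. Because the tree is plain data, a Jmol-style or PyMOL-style front end could in principle emit the same structure.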
Exploring new uses and executor APIs
Workflows including Apache Spark based analyses
Reproducible software deployment
Provenance
Farah Zaib Khan, Stian Soiland-Reyes, Tazro Inutano Ohta
PROV
{"@context": { "@base": "app://2e1287e0-6dfb-11e7-8acf-0242ac11000/" },
"@id": "workflow/master-job.json#",
"@type": "WorkflowRun",
"workflow": "workflow/packed.cwl#main",
"inputs": [
{"@id": "data/5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03",
"describedByParameter": "workflow/packed.cwl#main/in1"} ],
"outputs": [
{"@id": "data/00688350913f2f292943a274b57019d58889eda272370af261c84e78e204743c",
"describedByParameter": "workflow/packed.cwl#main/in1" } ],
"steps": [
{"@id": "urn:uuid:4305467e-6dfb-11e7-885d-0242ac110002",
"@type": "ProcessRun",
"step": "workflow/packed.cwl#main/step1"},
{"@id": "urn:uuid:c42dc36e-6dfd-11e7-bc24-0242ac110002",
"@type": "ProcessRun",
"step": "workflow/packed.cwl#main/step2"}
]
}
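A short sketch of how such a run record might be consumed, assuming it is read as plain JSON rather than processed with a JSON-LD library. The function name `summarize_run` is hypothetical, and the embedded document is a trimmed copy of the example above that keeps only the fields used here.

```python
import json

# Trimmed copy of the run document above; only the fields used here are kept.
RUN_JSON = """
{"@id": "workflow/master-job.json#",
 "@type": "WorkflowRun",
 "workflow": "workflow/packed.cwl#main",
 "steps": [
   {"@id": "urn:uuid:4305467e-6dfb-11e7-885d-0242ac110002",
    "@type": "ProcessRun",
    "step": "workflow/packed.cwl#main/step1"},
   {"@id": "urn:uuid:c42dc36e-6dfd-11e7-bc24-0242ac110002",
    "@type": "ProcessRun",
    "step": "workflow/packed.cwl#main/step2"}
 ]}
"""


def summarize_run(doc):
    """Map each workflow step name to the identifier of the process run that executed it."""
    return {
        run["step"].rsplit("/", 1)[-1]: run["@id"]
        for run in doc.get("steps", [])
        if run.get("@type") == "ProcessRun"
    }


run = json.loads(RUN_JSON)
step_runs = summarize_run(run)
```

The resulting `step_runs` dictionary links each step in the packed workflow to the run that executed it, which is the kind of lookup a provenance viewer would need.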
Singularity support in CWL
github.com/johnfonner/cwltool/tree/feature-singularity
Members: Isak Sylvin, John Fonner
Rabix Suite
CWLToil dynamic ResourceReqs
First Goal: Calculate resource requirements based on input files: number, sizes, and other metadata.
https://github.com/BD2KGenomics/toil/pull/1767
https://github.com/common-workflow-language/cwltool/issues/483
Final Goal: Calculate (computational || economic) costs before running a job on Toil/CWL, based on cores, input file sizes, memory, etc.
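The first goal can be illustrated with a rough sketch that derives a resource request from the number and total size of the input files. The function `estimate_resources` and all of its coefficients are hypothetical placeholders, not code from Toil or cwltool; only the key names `coresMin` and `ramMin` (in mebibytes) follow the CWL ResourceRequirement fields.

```python
import math
import os
import tempfile


def estimate_resources(paths, ram_per_input_byte=2.0, base_ram_mb=256, files_per_core=4):
    """Guess cores and RAM (in MiB) from the number and total size of input files.

    All coefficients are hypothetical and would need calibrating per tool.
    """
    total_bytes = sum(os.path.getsize(p) for p in paths)
    ram_mb = base_ram_mb + math.ceil(total_bytes * ram_per_input_byte / 2**20)
    cores = max(1, math.ceil(len(paths) / files_per_core))
    return {"coresMin": cores, "ramMin": ram_mb}


# Demonstrate with two dummy 1 MiB input files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name in ("reads_1.fq", "reads_2.fq"):
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(b"\0" * 2**20)
        paths.append(path)
    requirements = estimate_resources(paths)
```

A cost estimate for the final goal could be layered on top by multiplying the requested cores and memory by an expected runtime and a per-unit price, both of which would have to come from measurements.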
Michael Crusoe and Roman Valls Guimera
CWL SDKs 🛠
Members: Niall Beard, Kenzo Hillion, Hervé Ménager, Anton Khodak, Denis Yuen, Luka Stojanovic, Heather Wiencko with help from Ivan Batić, Maja Nedeljković, Michael Crusoe and Peter Amstutz
The Open Bioinformatics Community
Photo by Ntino Krampis: https://www.open-bio.org/wiki/Codefest_2017#Outcomes
Join us