1 of 8

BOSC 2022 COFEST!

Status updates / Project reporting

July 2022

2 of 8

OBF Newsletter + blog post

GOAL

Assemble the post-BOSC issue of the OBF newsletter (which goes out through the bosc-announce mailing list) and a blog post

STATUS

  • Created issue calling for community input (basic newsletter process)
  • Created newsletter file from template on dedicated branch
  • Created Google doc to collect contributions about BOSC

NEED !!!

More BOSC participants to contribute their favorite moments/takeaways in the Google doc

NEXT STEPS

Flesh out newsletter and blog text based on group input

Project Proposer:

Geraldine Van der Auwera

Stakeholders (if any):

All of OBF/BOSC

Interested developers:

Everyone can contribute a short note calling out highlights of BOSC

3 of 8

Adding GA4GH TES on top of Arvados Crunch

Status: proof of concept able to start a container, but ran into problems collecting output. Peter is now in contact with Kyle Ellrott & ELIXIR experts about GA4GH TES

https://github.com/arvados/arvados-tes

https://dev.arvados.org/issues/19272

Goal: Implement a GA4GH TES endpoint on top of the Arvados Crunch compute system. Test with the S3 endpoint of Arvados Keep.

Possible users of this include:

  • Nextflow
  • Snakemake
  • Cromwell
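For orientation, here is a minimal sketch of the kind of task message a TES client (such as the engines listed above) would send to a TES endpoint. The field names follow the public GA4GH TES task schema; the helper name, the container image, the task name, and the s3:// URLs are hypothetical placeholders and are not taken from the arvados-tes proof of concept.

```python
import json


def build_tes_task(image, command, input_url, output_url):
    """Build a minimal GA4GH TES task message as a plain dict.

    Hypothetical helper for illustration only; it is not part of arvados-tes.
    """
    return {
        "name": "cofest-demo",  # placeholder task name
        # File staged into the container before the executor runs.
        "inputs": [
            {"url": input_url, "path": "/data/input.txt", "type": "FILE"}
        ],
        # File copied out of the container after the executor finishes.
        "outputs": [
            {"url": output_url, "path": "/data/output.txt", "type": "FILE"}
        ],
        # One executor: the command's stdout is written to the output path.
        "executors": [
            {"image": image, "command": command, "stdout": "/data/output.txt"}
        ],
        "resources": {"cpu_cores": 1, "ram_gb": 1.0},
    }


task = build_tes_task(
    image="alpine:3.16",                          # assumed image
    command=["md5sum", "/data/input.txt"],
    input_url="s3://example-bucket/input.txt",    # hypothetical bucket
    output_url="s3://example-bucket/output.txt",  # hypothetical bucket
)

# A client would POST this JSON body to the TES server's task-creation endpoint.
payload = json.dumps(task, indent=2)
print(payload)
```

The output entry is the part that maps onto the status note above: a TES backend has to copy each declared output path back to its URL once the container exits.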

Project Proposer:

Peter Amstutz

Stakeholders (if any):

GA4GH, workflow users, other workflow engines with a TES backend

Interested developers:

Michael R. Crusoe

Tazro Ohta

[Add your name here]

4 of 8

EDAM ontology, optionally also bio.tools

  • Fixing BLAST output formats in EDAM (finally)💥 (Matúš, Tom)
  • Reviewed a few PRs on the EDAM Ontology repo (Hervé)
  • Discussed the bio.tools-EDAM-bioToolsSchema-BioSchemas etc. metadata landscape (Bhavesh, Hervé)

Project Proposer:

Matúš Kalaš

Participants:

Matúš Kalaš

Tom Madden

Hervé Ménager

Bhavesh Patel

Slack: #2022-cofest-edam

5 of 8

NCBI Datasets, a new way to access genome data at NCBI

NCBI Datasets is a new resource for accessing genome sequence and metadata directly from NCBI. This data can be retrieved through our new web interfaces, command line tools and OPEN API. This resource has been built with the help of our users. We invite the BOSC community to try Datasets and give feedback.

Please join us to try out Datasets

  • Try out our new command line tools for downloading data and working with metadata files
  • Explore our new web pages
  • Help improve our documentation
  • Provide feedback to help us improve the Datasets resource

Resources

Please contact me if there is a Datasets-related project you want to explore!

Project Proposer:

Nuala O’Leary

olearyna@ncbi.nlm.nih.gov

Stakeholders:

Interested Participants:

6 of 8

FAIRshare, a Tool for Making Biomedical Research Data and Software FAIR

FAIRshare is a free, open-source, cross-platform desktop application that combines an intuitive user interface with automation to streamline the process of making biomedical research data and software Findable, Accessible, Interoperable, and Reusable (FAIR). In our current phase of development, we have implemented a workflow for making biomedical research software FAIR according to the FAIR-BioRS guidelines established by our team. We are now working on implementing workflows for making infectious-disease-related research data (immunology, genomics, etc.) FAIR according to applicable guidelines.

CoFest outcomes: We got great input from Michael Crusoe, Hervé Ménager, and Hilmar Lapp on improving the FAIR-BioRS guidelines. Michael has also submitted a PR to include recommendations for documenting CLIs in the FAIR-BioRS guidelines. Hilmar is providing suggestions for publishing the guidelines, and we may collaborate on the manuscript. We will stay in touch with them! Tazro Ohta tested FAIRshare with one of his software projects hosted on GitHub and provided very useful feedback as well. All in all, this was a wonderful opportunity for us to get our work in front of experts!

Resources

Project Proposer:

Bhavesh Patel (bpatel@calmi2.org),

Sanjay Soundarajan (ssoundarajan@calmi2.org)

Stakeholders:

Interested Participants:

Laura Gorrell (lgorrell@deloitte.com)

Tazro Ohta (t.ohta@dbcls.rois.ac.jp)

7 of 8

ElasticBLAST - Accelerating Alignments in the Cloud

ElasticBLAST makes it simpler to run BLAST on the cloud (AWS or GCP).

It is relatively new, and we’d like to make it more accessible and a good place for collaboration.

This is a good project for both Python developers and non-coders.

We will have a Jupyter notebook that you can use to try out NCBI Datasets, then use ElasticBLAST with the results of that work.

Topics:

  • Try out the notebook and help us improve it.
  • Give us your frank and brutal feedback.
  • Improve our documentation
    • What’s obscure or too complex?
    • Add examples

We will be online Friday only, so don’t miss us.

Links:

ElasticBLAST Documentation

GitHub for ElasticBLAST

Project Proposer:

Tom Madden

Stakeholders (if any):

Greg Boratyn

Interested Participants:

YOUR NAME HERE!

8 of 8

Contact with any questions

Email: tschlapp@broadinstitute.org

Slack: Same as above! Thomas Schlapp