1 of 22

Planning Slide

  • Galaxy’s past projects and goals (30 seconds)
  • Why care about soil (45 seconds)
  • What are GDSCN and BioDIGS (90 seconds)
  • Galaxy Workflows and Tutorials (150 seconds)
    • Move on to how it has already been used by students
  • More project info (60 seconds)
  • Early results (180 seconds)
  • Summary (30 seconds)

2 of 22

BioDIGS: A GDSCN Project

BioDiversity and Informatics for Genomics Scholars

Tyler Collins, Alex Ostrovsky, Michael Schatz, The GDSCN community

June 24, 2024

3 of 22

Galaxy and Large Genomics Projects Helping People

COVID-19

Large-scale genome assembly

Cloud computation enhancements

Workflow enhancements

5 of 22

Genomic Data Science Community Network

Promoting education and research in genomic data science at HBCUs, HSIs, TCUs, CCs, and other underserved institutions

6 of 22

Genomic Diversity

7 of 22

Genomic Diversity

Enumerating soil biodiversity

Anthony et al. (2023) PNAS. doi:10.1073/pnas.2304663120

Soil is likely home to 59% of life, making it the singular most biodiverse habitat on Earth.

8 of 22

BioDIGS: What can we learn from the soil?

    • Microbiology & Metagenomics
      • What’s there? How does the genomic composition change in time and space?
    • Genomics & Bioinformatics
      • Optimal approaches for metagenome assembly and classification? Merits of short- vs long-read sequencing?
    • Agriculture & Environment
      • How do characteristics of the soil & soil microbiome modulate plant & animal development?
    • Public Health
      • How does the soil microbiome influence the human microbiome & human health outcomes?

biodigs.org

9 of 22

GDSCN Supports the Research Community

  • For students:
    • Galaxy trainings and support
    • Computational and training resources regardless of student or school resources
    • Possibility of publications

10 of 22

GDSCN Supports the Research Community

  • For professors:
    • Professor-designed lesson plans with specific deliverables
    • Access to Galaxy community for help
    • Avoids software dependency problems across students' different operating systems

11 of 22

GDSCN Supports the Research Community

  • For Galaxy:
    • New tools and IWC workflows
    • A good starting point for bioinformatics training
    • Publications

12 of 22

BioDIGS Analysis

Genomic Diversity

Environmental Associations

Human Health & Disease

13 of 22

DC+Baltimore pilot study sampling

Montgomery Co.: 24 samples (students from MC)

Baltimore: 24 samples (students from NDMU, MC, and CSM)

Map labels: Stony Run (near JHU), Lake Needwood, Gwynns Falls Trailhead

Interactive map at: http://biodigs.org

14 of 22

Genomic Diversity

Metagenome profiling and containment estimation through abundance-corrected k-mer sketching with sylph

Shaw and Yu (2023) bioRxiv. doi:10.1101/2023.11.20.567879

Baltimore

Montgomery
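
The paper title above refers to containment estimation through k-mer sketching. As a rough illustration of the underlying idea only, plain k-mer containment can be sketched in Python as below. This is not sylph's actual algorithm, which additionally corrects for abundance; the function names, the choice of hash, and the FracMinHash-style `scale` parameter are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k, scale=1):
    """FracMinHash-style sketch: keep only k-mer hashes below 2**64 // scale.

    With scale=1 every k-mer is kept; larger values subsample roughly
    1/scale of the distinct k-mers.
    """
    threshold = 2**64 // scale
    return {h for h in map(kmer_hash, kmers(seq, k)) if h < threshold}


def containment(genome_sketch, sample_sketch):
    """Fraction of the genome's sketched k-mers that also occur in the sample."""
    if not genome_sketch:
        return 0.0
    return len(genome_sketch & sample_sketch) / len(genome_sketch)


# Toy usage: the "sample" contains the whole "genome", so every genome
# k-mer is found in the sample.
genome = "ACGTACGTGGA"
sample = "TT" + genome + "CC"
score = containment(sketch(genome, 4), sketch(sample, 4))
```

A containment near 1 suggests the reference genome is present in the sample, while a value near 0 suggests it is absent. Real metagenomes need much longer k-mers and subsampling (`scale` > 1) to stay tractable.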

15 of 22

Heavy metal associations

[Figure: panels for Iron, Lead, and Arsenic, with EU Limit and VT Limit marked]

16 of 22

BioDIGS Analysis

  • New and updated metagenomic tools:
    • Sylph tool suite
    • New taxonomic databases
  • New workflows:
    • Multiple taxonomic assignment workflows
    • Functional analysis workflows
    • De novo assembly and MAG binning
  • Long reads:
    • Improving assembly efficacy with HiFi reads
    • metaFlye

17 of 22

BioDIGS Analysis

Katherine Ulbricht

Nia Davis

Ayalew lab

Spelman College

18 of 22

Summary

  • Exploring the interaction between soil, metagenomic diversity, and human health
    • Discover new genomes and genes
    • Discover new antimicrobial resistance genes and human metabolites present in the soil
    • Orders of magnitude improvements from long reads
  • Empower the next generation of scientists
    • Teach state-of-the-art genomics & data science
    • Provide open access data & compute for community research
  • Next Steps: Dynamics across space & time!
    • More institutions, longitudinal analysis
    • Exposures, climate, health data, etc.
    • Training materials and research notes

We need your help!

19 of 22

Get involved with BioDIGS and GDSCN

Monday, June 24th, 6:00 pm

Room B

20 of 22

Student Acknowledgements

Montgomery County

Madeline Graham, Daniel Chin, David Soussana

Baltimore

Loraye Smith, Tyler Smith, Madeline Graham

Seattle

Matheus Fernandes, Carl Pontino, Randon Serikawa, Joelle Taganna

Clovis Community College

Malachi Whitford, Grace Freymiller, Domonique Advincula, Troy Burgess, Janet Castillo, Jennifer Elziade, Dorthy Esparza, Nicholas Foreman, Ana Hernandez, Glenda Medina, Christina Munoz, Nicole Potter, Quince Quintana, Nickie Ruiz, Ryan Wilder, Orion York

Meharry Medical College

Lincoln Liburd II, Sydney Jamison, Destiney Ball, Claude Albritton, Arjun Pratap

Virginia State University

Michael Marone

Northern Virginia Community College

Rachel Marie Ametin, Joceph Duncan, Noha Elnawam, Sarah-Leila Kaci

University of Texas - El Paso

Frida Delgadillo, Armando Jimenez, Keyan Ozuna

El Paso Community College

Efren Barragan, Faith Chanhuhwa, Tania Da Silveira, Marco Ferrel, Josh Samuel Ikechi-Konkwo, Olivia Kelly, America Pinela, Ryley Stewart

Spelman College

Natajha Graham, Nia Davis, Katherine Ulbricht

University of Hawaii - Mānoa

Sudhir Kumar Rai, Yujia Qin, Ba Thong Nguyen, Mohammadamin Mahmanzar, Yu Chen, Isam Mohd Ibrahim, Donna Lee Kuehu, Asmita Pandey

21 of 22

Acknowledgements

GDSCN Faculty

Andrew Lee, Northern Virginia Community College

Brandi Kamermans, Northwest Indian College

Edu Suarez Martinez, University of Puerto Rico

Emily Biggane, United Tribes Technical College

James Sniezek, Montgomery College

Joslynn Lee, Fort Lewis College

Kamal Chowdhury, Claflin University

Karla Fuller, Guttman Community College

Loyda B. Méndez, University of Puerto Rico at Aguadilla

Lyle Best, Turtle Mountain Community College

Maria Alvarez, El Paso Community College

Mentewab Ayalew, Spelman College

Michael Campbell, University of Southern California

Michele Nishiguchi, University of California Merced

Miguel P Mendez Gonzalez, University of Puerto Rico

Peter Vos, Montgomery College

Rachel Saidi, Montgomery College

Robert Meller, Morehouse School of Medicine

Rosa Alcazar, Clovis Community College

Shazia Tabassum Hakim, Diné College

Sidd Pratap, Meharry Medical College

Sourav Roy, University of Texas at El Paso

Xianfa Xie, Virginia State University

Youping Deng, University of Hawaii at Mānoa

GDSCN Organizers

Johns Hopkins University: Michael Schatz, Lance O’Connor, Matthew Nguyen, Stephen Mosher, Natalie Kucher, Tyler Collins, Alex Ostrovsky

Fred Hutch Cancer Research Center: Jeffrey Leek, Ava Hoffman, Elizabeth Humphries, Carrie Wright

Carnegie Institute: Frederick Tan

National Human Genome Research Institute: Shurjo Sen, Christina Daulton, Valentina Di Francesco, Eric Green

JHU GRCF

David Mohr, Alejandra Gutierrez, Samantha Zimmer, Alan Scott, Kim Doheny

CSHL

W. Richard McCombie, Sara Goodwin, Senem Mavruk Eskipehlivan

PacBio

Mark Van Oene, Kelvin Liu, Jeffrey Burke, Primo Baybayan, Michelle Kim, Dan Portik, Jeremy Wilkinson, Trang Dahlen, Jonas Korlach

CosmosID

Kelly Moffat, Manoj Dadlani, Rita Colwell

Tiny Earth

Jennifer Kerr, Nichole Broderick, Sarah Miller

22 of 22