1 of 15

Modeling microbiome-trait associations with taxonomy-adaptive neural networks

Yifan Jiang

University of Waterloo

2 of 15

Two missions in microbiome research

Predict disease status from microbiome profiles

Identify critical microbes associated with diseases

3 of 15

Challenges in modeling microbiome-disease associations

  • High feature dimensionality
  • Low sample size
  • Impede the applicability of powerful ML models
  • Uncertainty in read mapping

Potential profiling inaccuracies

Curse of dimensionality

Unbalanced dataset size

4 of 15

An ideal computational model for microbiome-disease associations

How can we achieve such a model?

Predictability: accurate prediction

Interpretability: discover new biology

Trainability: applicable to small datasets

5 of 15

Modeling dilemma: predictability vs. trainability vs. interpretability

  • Linear model
  • Random forest
  • MIOSTONE
  • Multi-layer Perceptron

Predictability

Trainability

Interpretability

6 of 15

We can mitigate the modeling dilemma by leveraging the inherent correlation structure among taxa

[Figure: MIOSTONE workflow. Labels: Input, Taxonomy, Train, Predict, Output]

7 of 15

MIOSTONE addresses the challenges in modeling microbiome-disease associations

Data-driven taxonomic aggregation

Potential profiling inaccuracies

Curse of dimensionality

Unbalanced dataset size
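
Taxonomic aggregation can be illustrated with a plain roll-up of abundances from child taxa to their parent. The sketch below is only that: the slides do not say how MIOSTONE decides when to aggregate, so the data-driven part is not shown. The function name, the species names chosen, and the abundance values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_parent(abundances, parent_of):
    """Sum child-taxon abundances into their parent taxon."""
    totals = defaultdict(float)
    for taxon, value in abundances.items():
        totals[parent_of[taxon]] += value
    return dict(totals)


# Hypothetical species-level relative abundances for one sample.
species_profile = {
    "Prevotella copri": 0.20,
    "Prevotella stercorea": 0.05,
    "Bacteroides fragilis": 0.50,
    "Bacteroides ovatus": 0.25,
}

# Hypothetical slice of a taxonomy: species -> genus.
species_to_genus = {
    "Prevotella copri": "Prevotella",
    "Prevotella stercorea": "Prevotella",
    "Bacteroides fragilis": "Bacteroides",
    "Bacteroides ovatus": "Bacteroides",
}

# Four species-level features collapse into two genus-level features.
genus_profile = aggregate_to_parent(species_profile, species_to_genus)
```

Rolling up reduces the feature count and pools reads that may have been mapped to the wrong sibling taxon, at the cost of species-level resolution.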

8 of 15

MIOSTONE provides accurate predictions of the disease status

Data:

  • IBD (inflammatory bowel disease)
  • 174 samples / 5287 taxa

Task:

  • Disease status prediction
  • Crohn's disease (CD) and ulcerative colitis (UC)

Evaluation:

  • 5-fold cross-validation
  • AUPRC / AUROC

[Figure: AUPRC and AUROC of RF, SVM, MLP, PopPhy-CNN, TaxoNN, and MIOSTONE]
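
The evaluation protocol named on this slide (5-fold cross-validation scored by AUROC and AUPRC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline behind the reported numbers: the folds are interleaved rather than shuffled or stratified, AUPRC is estimated by average precision, and the labels and scores are made up. All function names are hypothetical.

```python
def kfold_indices(n_samples, k=5):
    """Split sample indices 0..n_samples-1 into k interleaved test folds."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean precision at the rank of each positive; a common AUPRC estimate.

    Tied scores keep their input order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


# 174 samples split into 5 test folds of sizes 35, 35, 35, 35, 34.
folds = kfold_indices(174, k=5)
fold_sizes = [len(fold) for fold in folds]

# Made-up labels and model scores for one test fold.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)           # 0.75
toy_ap = average_precision(toy_labels, toy_scores)  # (1/1 + 2/3) / 2
```

In a full run, each fold serves once as the test set while the model is trained on the other four, and the two metrics are averaged over the five folds.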

9 of 15

MIOSTONE improves predictive performance in sample-limited tasks through knowledge transfer

Data:

  • HMP2 (Human Microbiome Project): a larger dataset for UC/CD
  • IBD

Task:

  • Direct prediction on IBD (zero-shot)
  • Fine-tuning on IBD
  • Training on IBD from scratch

Evaluation:

  • AUPRC / AUROC

[Figure: AUPRC and AUROC for zero-shot, training from scratch, and fine-tuning]
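
The three settings compared on this slide differ only in where the weights come from and whether the small dataset is used for training. The sketch below shows that difference with a toy logistic regression standing in for the neural network. It is not MIOSTONE, and the "source" and "target" datasets, the step counts, and the learning rate are all made up for illustration.

```python
import math


def predict(weights, features):
    """Logistic regression probability; weights[0] is the bias."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, data):
    """Mean binary cross-entropy over (features, label) pairs."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = predict(weights, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(weights, data, steps, lr=0.5):
    """Full-batch gradient descent starting from the given weights."""
    weights = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad)]
    return weights


# Made-up larger source dataset and small target dataset.
source = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.7, 0.3], 1),
          ([0.2, 0.8], 0), ([0.1, 0.9], 0), ([0.3, 0.7], 0)]
target = [([0.85, 0.15], 1), ([0.25, 0.75], 0)]

pretrained = train([0.0, 0.0, 0.0], source, steps=200)

zero_shot = pretrained                            # no training on the target
fine_tuned = train(pretrained, target, steps=20)  # start from pretrained weights
scratch = train([0.0, 0.0, 0.0], target, steps=20)  # start from zeros

target_losses = {
    "zero-shot": log_loss(zero_shot, target),
    "fine-tuning": log_loss(fine_tuned, target),
    "from scratch": log_loss(scratch, target),
}
```

Fine-tuning and training from scratch use the same target data and the same number of steps, so any gap between them comes from the starting weights alone.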

10 of 15

MIOSTONE identifies microbiome-disease associations with high interpretability

Data:

  • IBD

Task:

  • Biomarker discovery

Evaluation:

  • Literature evidence
  • The Prevotella genus (Kabeerdoss et al., Indian Journal of Medical Research, 2015)

11 of 15

Conclusion

MIOSTONE serves as an effective predictive model: it not only accurately predicts microbiome-trait associations across extensive real datasets but also offers interpretability for scientific discovery.

Key idea:

  • Reducing feature noise by aggregating taxa in a data-driven way
  • Mitigating the curse of dimensionality with a taxonomy-encoded architecture
  • Applicable to small datasets via knowledge transfer

Impact:

  • Facilitating in silico investigations into the biological mechanisms underlying microbiome-trait associations

12 of 15

Questions?
