AI for biology and drug discovery
February 2025
Jean-Philippe Vert, PhD
Chief R&D Officer, Owkin
Co-founder and CEO, Bioptimus
My journey
2000 2005 2010 2015 2020
PhD
Academics
Industry
The cancer problem
What is cancer?
All happy families are alike; each unhappy family is unhappy in its own way.
- Leo Tolstoy, Anna Karenina.
All cancers are different
AI creates value at every stage of the drug life cycle (ex: Owkin)
Confidential information: do not share without written permission from Owkin Inc.
But it is hard: Biology lacks equations
F = ma
90% of drug clinical developments fail
AI is a game-changer
Machine learning model = Foundation model + Predictive model
Foundation model: large model, trained without annotations, shared across tasks
Predictive model: small model, trained with annotations, task-specific
The technology behind: not equations, but Foundation models
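The split between a large shared model and a small task-specific model can be sketched as below. This is a minimal illustration, not Owkin's or Bioptimus's pipeline: the "foundation model" is replaced by a fixed random projection (`encode`), the head is a plain logistic regression, and the data and labels are synthetic. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: stand-in for a large pretrained foundation model. A real encoder
# would be trained on unlabeled data; here it is a fixed random projection.
W_frozen = rng.normal(size=(32, 8))

def encode(x):
    """Frozen encoder shared across tasks: (n, 32) inputs -> (n, 8) embeddings."""
    return np.tanh(x @ W_frozen)

def train_head(embeddings, labels, lr=0.5, steps=500):
    """Small task-specific head: logistic regression fit by gradient descent."""
    w = np.zeros(embeddings.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(embeddings @ w + b)))
        grad = p - labels
        w -= lr * embeddings.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

def predict(x, w, b):
    """Full model = frozen encoder followed by the trained head."""
    return (encode(x) @ w + b > 0).astype(int)

# Synthetic annotated dataset: the label depends on the first embedding dimension.
x = rng.normal(size=(200, 32))
y = (encode(x)[:, 0] > 0).astype(int)

w, b = train_head(encode(x), y)   # only the head sees the annotations
accuracy = (predict(x, w, b) == y).mean()
```

Only `train_head` uses labels; `encode` is never updated, which is what lets one foundation model serve many downstream tasks.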
LLM (e.g., GPT)
Chatbot
Text generation
Copilot
[and many more…]
Large Language Models (LLMs) are Foundation Models that “learn the human language”
Confidential information: do not share without written permission from Bioptimus
Foundation model for biology
AI digital twin
AI diagnostics
AI drug design
[and many more…]
ORGANISMS
TISSUES
CELLS
MOLECULES
Can we similarly “learn the language of biology”?
ORGANISMS
TISSUES
CELLS
MOLECULES
Many Foundation Models have emerged recently to capture various aspects of biology
Ex: Proteins
(cherry-picked examples)
Foundation models: a paradigm shift in AI
Ex: single cell omics
(cherry-picked example)
Foundation models: a paradigm shift in AI
Towards an AI Virtual Cell?
Ex: Histopathology
The first and only CE-marked AI diagnostic that optimizes MSI testing for colorectal cancer through a pre-screening approach with digital H&E slides.
98% NPV¹
PDF report
95% sensitivity¹
Digital pathology
Workflow agnostic
Testing
Suitable for adults with primary colorectal cancer
Works with surgically resected tissue on digitized H&E slides.
Equivalent when compared to standard testing techniques.
Deployment is feasible across IMS systems, or even a shared directory.
Delivered as a PDF report with intuitive design.
Provides high confidence in MSS-AI predictions
95%
98%
¹ Svrcek M., Saillard C., Dubois R. Blind validation of MSIntuit, an AI-based pre-screening tool for MSI detection from colorectal cancer H&E slides. Poster presented at: European Society for Medical Oncology (ESMO); May 9th - 13th 2022; Paris, France.
MSIntuit CRC is CE-IVD marked for diagnostic use in Europe and registered with the MHRA in the United Kingdom. In all other countries, including the United States, the use of the product is limited to research, not for use in diagnostic procedures.
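For readers less familiar with the two metrics quoted on this slide, here is how sensitivity and negative predictive value (NPV) are computed from a confusion matrix. In a pre-screening setting, "negative" is the MSS-AI call, so NPV is the share of ruled-out slides that are truly MSS. The counts below are hypothetical and are not taken from the cited study.

```python
def sensitivity(tp, fn):
    """Share of truly positive (MSI) cases that the test flags as positive."""
    return tp / (tp + fn)

def npv(tn, fn):
    """Share of negative (MSS) calls that are truly negative."""
    return tn / (tn + fn)

def ruled_out_fraction(tp, fp, tn, fn):
    """Share of all slides called negative, i.e. spared from further testing."""
    return (tn + fn) / (tp + fp + tn + fn)

# Hypothetical counts for illustration only (not study data).
tp, fn, tn, fp = 18, 2, 48, 32

sens = sensitivity(tp, fn)                      # 18 / 20
neg_pred_value = npv(tn, fn)                    # 48 / 50
ruled_out = ruled_out_fraction(tp, fp, tn, fn)  # 50 / 100
```

A pre-screening tool is tuned for high sensitivity and NPV, accepting false positives, because positive calls go on to standard testing anyway.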
Main ideas to train a foundation model for histopathology from (many) images without annotation
Contrastive learning
“This is the same tissue”
Generative learning
“Guess the missing pieces”
And endless combinations…
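The two ideas can be written as loss functions. The slide does not name specific methods, so the sketch below uses two common instances chosen for illustration: an InfoNCE-style contrastive loss ("this is the same tissue") and a masked-patch reconstruction loss ("guess the missing pieces"). Embeddings and patches are random arrays, and all function names are hypothetical.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """Contrastive loss: row i of z_a should match row i of z_b (two views of
    the same tile) and not the other rows in the batch."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def mask_patches(patches, mask_ratio, rng):
    """Hide a random subset of patches; return the masked copy and the indices."""
    n = patches.shape[0]
    n_masked = int(round(mask_ratio * n))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    visible = patches.copy()
    visible[masked_idx] = 0.0
    return visible, masked_idx

def reconstruction_loss(pred, target, masked_idx):
    """Mean squared error, evaluated on the hidden patches only."""
    return float(np.mean((pred[masked_idx] - target[masked_idx]) ** 2))

rng = np.random.default_rng(0)

# Contrastive: matching views give a low loss, mismatched views a high one.
z = rng.normal(size=(8, 16))
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))
loss_shuffled = info_nce(z, z[::-1])

# Generative: mask 75% of 16 patches; a model would then predict the hidden ones.
patches = rng.normal(size=(16, 4))
visible, masked_idx = mask_patches(patches, 0.75, rng)
```

Neither loss needs annotations: the supervision comes from the images themselves, which is what makes training on many unlabeled slides possible.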
Phikon (2023)
Ranked 1st out of 1327 teams
Jan 2024
5.5k WSI, 40M tiles, 16 cancer types, 86M parameters
Can we get better results with more data and a larger model?
Available at https://huggingface.co/bioptimus/H-optimus-0
H-optimus-0 (July 2024), largest open-source FM for pathology
import timm
# Load the pretrained H-optimus-0 weights from the Hugging Face Hub
model = timm.create_model("hf-hub:bioptimus/H-optimus-0", pretrained=True)
Courtesy of Bioptimus: Rodolphe Jenatton, Felipe Llinares, Zelda Mariet, Charlie Saillard
Distillation improves robustness (Jan 2025)
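The slide does not describe the distillation recipe. As a reminder of the general idea only, the sketch below shows generic knowledge distillation, where a smaller student is trained to match the softened output distribution of a larger teacher. The logits are random and the temperature is an arbitrary choice; nothing here is taken from the actual model.

```python
import numpy as np

def softmax(x, temperature):
    """Row-wise softmax with temperature; higher temperature = softer targets."""
    z = x / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over the batch, on softened distributions."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return float(np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=1)))

rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 10))   # hypothetical teacher outputs
student_logits = rng.normal(size=(4, 10))   # hypothetical untrained student

loss_untrained = distillation_loss(student_logits, teacher_logits)
loss_perfect = distillation_loss(teacher_logits, teacher_logits)
```

The loss is zero when the student reproduces the teacher exactly and positive otherwise, so minimizing it pulls the student toward the teacher's behavior.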
Foundation model for biology
ORGANISMS
TISSUES
CELLS
MOLECULES
Next frontier: connect the dots across scales and modalities?
Ex: from pathology to transcriptomics
RGB image: $10
20k genes: $10,000
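The appeal is the cost gap: at the figures on this slide, an expression profile costs 1,000 times more than the image ($10,000 / $10). One simple way to frame the prediction task, assumed here purely for illustration, is a regression from slide-level embeddings to per-gene expression. The sketch uses ridge regression on synthetic data with a linear relationship built in, so the correlations it produces say nothing about real performance. Dimensions are scaled down (50 genes instead of 20k) and all names are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: one weight vector per gene."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def per_gene_pearson(pred, true):
    """Pearson correlation between predicted and measured expression, per gene."""
    pc = pred - pred.mean(axis=0)
    tc = true - true.mean(axis=0)
    return (pc * tc).sum(axis=0) / np.sqrt((pc ** 2).sum(axis=0) * (tc ** 2).sum(axis=0))

rng = np.random.default_rng(0)
n_slides, emb_dim, n_genes = 300, 16, 50

# Synthetic stand-ins: X plays the role of foundation-model slide embeddings,
# Y the role of measured expression (linear in X by construction, plus noise).
X = rng.normal(size=(n_slides, emb_dim))
W_true = rng.normal(size=(emb_dim, n_genes))
Y = X @ W_true + 0.1 * rng.normal(size=(n_slides, n_genes))

W = fit_ridge(X[:200], Y[:200])          # fit on 200 slides
pred = X[200:] @ W                       # predict the held-out 100
corr = per_gene_pearson(pred, Y[200:])   # one score per gene
```

Per-gene correlation on held-out slides is a natural way to report results for this task, since some genes are far more predictable from morphology than others.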
The latest FMs boost accuracy
Spatial omics
WES
H&E
Bulk RNA-seq
Clinical data
Single-cell RNA-seq
High volume data target
More data is coming
Micro
Macro
Transformer 1
Transformer 2
Transformer 3
Transformer N
Novel architectures are being explored to capture the complexity of biology…
Stay tuned!
Thank you!