1 of 29

AI for biology and drug discovery

February 2025

Jean-Philippe Vert, PhD

Chief R&D Officer, Owkin

Co-founder and CEO, Bioptimus

2 of 29

My journey

2000 2005 2010 2015 2020

PhD

Academics

Industry

3 of 29

The cancer problem

  • 19 million new cases / year worldwide
  • 10 million deaths / year (1 out of 6)
  • 1 in 3 males and 1 in 4 females diagnosed before age 74 in “rich” countries
  • 70% of deaths in low- and middle-income countries

4 of 29

What is cancer?

  • A growing mass that may spread to and disrupt other organs
  • Made of fast-dividing cells that look different from normal cells
  • Because their (epi-)genome is abnormal

5 of 29

All happy families are alike; each unhappy family is unhappy in its own way.

- Leo Tolstoy, Anna Karenina

All cancers are different

6 of 29

7 of 29

AI creates value at every stage of the drug life cycle (ex: Owkin)

Confidential information: do not share without written permission from Owkin Inc.

8 of 29

But it is hard: Biology lacks equations

F = ma

90% of drug clinical developments fail

9 of 29

AI is a game-changer

10 of 29

The technology behind: not equations, but Foundation models

Machine learning model = Foundation model + Predictive model

  • Foundation model: large model, trained without annotations, shared across tasks
  • Predictive model: small model, trained with annotations, task-specific
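
The decomposition on this slide can be illustrated with a minimal sketch: a frozen encoder stands in for the foundation model, and a small logistic-regression head is trained on a few annotated examples. Everything here is an assumption made for illustration (the random "pretrained" weights, the synthetic labels, the names `frozen_encoder` and `train_head`); a real foundation model would be learned from large unannotated datasets rather than drawn at random.

```python
import math
import random


def frozen_encoder(x, weights):
    """Stand-in for a foundation model: a fixed feature map that is never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def train_head(features, labels, lr=0.1, epochs=100):
    """Fit a small task-specific head (logistic regression) on frozen features."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            logit = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            g = p - y  # gradient of the log-loss with respect to the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(f, w, b):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


rng = random.Random(0)

# Hypothetical "pretrained" weights (random here, purely for illustration).
encoder_weights = [[rng.gauss(0, 0.5) for _ in range(4)] for _ in range(8)]

# Small synthetic annotated dataset: label is 1 when the first coordinate is positive.
inputs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
labels = [1 if x[0] > 0 else 0 for x in inputs]

# Shared across tasks: features are computed once, the encoder is not retrained.
features = [frozen_encoder(x, encoder_weights) for x in inputs]

# Task-specific: only the small head sees the annotations.
w, b = train_head(features, labels)
accuracy = sum(predict(f, w, b) == y for f, y in zip(features, labels)) / len(labels)
print(f"training accuracy of the small head: {accuracy:.2f}")
```

The same `features` could feed several heads, one per task, which is the sense in which the large model is shared and the small model is task-specific.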

11 of 29

LLM (eg, GPT)

Chatbot

Text generation

Copilot

[and many more…]

Large Language Models (LLMs) are Foundation Models that “learn the human language”

Confidential information: do not share without written permission from Bioptimus

12 of 29

Foundation model for biology

AI digital twin

AI diagnostics

AI drug design

[and many more…]

ORGANISMS

TISSUES

CELLS

MOLECULES

Can we similarly “learn the language of biology”?

13 of 29

ORGANISMS

TISSUES

CELLS

MOLECULES

Many Foundation Models have emerged recently to capture various aspects of biology

14 of 29

Ex: Proteins

(cherry-picked examples)

15 of 29

Foundation models: a paradigm shift in AI

Ex: single cell omics

(cherry-picked example)

16 of 29

Foundation models: a paradigm shift in AI

Towards an AI Virtual Cell?

17 of 29

Ex: Histopathology

18 of 29

The first and only CE-marked AI diagnostic that optimizes MSI testing for colorectal cancer through a pre-screening approach with digital H&E slides.

98% NPV¹

PDF report

95% sensitivity¹

Digital pathology

Workflow agnostic

Testing

Suitable for adults with primary colorectal cancer

Works with surgically resected tissue on digitized H&E slides.

Equivalent when compared to standard testing techniques.

Deployment is feasible across IMS systems, or even a shared directory.

Delivered as a PDF report with intuitive design.

Provides high confidence in MSS-AI predictions

95%

98%

¹ Svrcek M., Saillard C., Dubois R. Blind validation of MSIntuit, an AI-based pre-screening tool for MSI detection from colorectal cancer H&E slides. Poster presented at: European Society for Medical Oncology (ESMO); May 9-13, 2022; Paris, France.

MSIntuit CRC is CE-IVD marked for diagnostic use in Europe and registered with the MHRA in the United Kingdom. In all other countries, including the United States, the use of the product is limited to research, not for use in diagnostic procedures.
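
For readers less familiar with the two metrics quoted on this slide, the sketch below shows how sensitivity and negative predictive value (NPV) are computed from a confusion matrix. It assumes MSI is the "positive" class and MSS the "negative" class. The counts are invented for illustration and are not the validation results of the study cited above.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly MSI cases that the tool flags as MSI."""
    return true_pos / (true_pos + false_neg)


def negative_predictive_value(true_neg, false_neg):
    """Fraction of 'MSS' predictions that are truly MSS."""
    return true_neg / (true_neg + false_neg)


# Hypothetical counts for 100 slides (illustration only, NOT study data).
true_pos, false_neg, true_neg, false_pos = 18, 2, 50, 30

sens = sensitivity(true_pos, false_neg)
npv = negative_predictive_value(true_neg, false_neg)

# In a pre-screening workflow, slides predicted MSS could skip confirmatory testing.
total = true_pos + false_neg + true_neg + false_pos
ruled_out_share = (true_neg + false_neg) / total

print(f"sensitivity = {sens:.3f}")
print(f"NPV = {npv:.3f}")
print(f"share of slides ruled out by pre-screening = {ruled_out_share:.2f}")
```

A high NPV is what makes the rule-out step trustworthy: when the tool predicts MSS, that prediction is rarely wrong.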

19 of 29

Main ideas to train a foundation model for histopathology from (many) images without annotation

Contrastive learning

“This is the same tissue”

Generative learning

“Guess the missing pieces”

And endless combinations…
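
The two ideas above can be sketched in a few lines. The first function is an InfoNCE-style contrastive loss, which is low when each embedding is closest to its own positive ("this is the same tissue"). The second hides a random subset of patches, which is the input side of a "guess the missing pieces" objective. This is a toy illustration with hand-written 2-D embeddings, not the training code of any model mentioned in this deck; the function names and the temperature value are assumptions.

```python
import math
import random


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean contrastive loss: anchors[i] should match positives[i], not positives[j != i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def mask_patches(patches, ratio, rng):
    """Hide a random subset of patches; a generative model would be trained to fill them in."""
    n_hidden = int(len(patches) * ratio)
    hidden = set(rng.sample(range(len(patches)), n_hidden))
    visible = [None if i in hidden else p for i, p in enumerate(patches)]
    return visible, sorted(hidden)


# Contrastive: two augmented views of the same tile should have nearby embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
matched = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]  # view i paired with tile i
mismatched = matched[1:] + matched[:1]           # pairs deliberately shuffled
loss_matched = info_nce(anchors, matched)
loss_mismatched = info_nce(anchors, mismatched)

# Generative: hide 75% of 16 patches and keep track of which ones are missing.
patches = [f"patch_{i}" for i in range(16)]
visible, hidden = mask_patches(patches, 0.75, random.Random(0))
```

Neither objective needs a label: the supervision comes from the images themselves, which is what allows training on many slides without annotation.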

20 of 29

Phikon (2023)

Ranked 1st out of 1327 teams

Jan 2024

5.5k WSIs, 40M tiles, 16 cancer types, 86M parameters

21 of 29

Can we get better results with more data and a larger model?

22 of 29

H-optimus-0 (July 2024), largest open-source FM for pathology

import timm

model = timm.create_model("hf_hub:bioptimus/H-optimus-0", pretrained=True)

Courtesy of Bioptimus: Rodolphe Jenatton, Felipe Llinares, Zelda Mariet, Charlie Saillard

  • Trained on proprietary datasets of 600k whole-slide images
  • 1B+ parameter vision transformer
  • Apache 2.0 license, open source
  • Beats SOTA on many benchmarks

23 of 29

Distillation improves robustness (Jan 2025)

24 of 29

Foundation model for biology

ORGANISMS

TISSUES

CELLS

MOLECULES

Next frontier: connect the dots across scales and modalities?

25 of 29

Ex: from pathology to transcriptomics

RGB image

$10

20k genes

$10,000
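
One simple way to frame this task, sketched below under strong assumptions, is to average the tile embeddings of a slide into one vector and fit a linear map from that vector to an expression value per gene. The data are synthetic, only 3 "genes" are predicted instead of 20k, and the names `mean_pool`, `predict_expression`, and `fit` are hypothetical. The models shown in this deck are not claimed to work this way.

```python
import random


def mean_pool(tile_embeddings):
    """Average tile embeddings into a single slide-level vector."""
    n = len(tile_embeddings)
    return [sum(col) / n for col in zip(*tile_embeddings)]


def predict_expression(slide_vec, weights, biases):
    """Linear map from a slide vector to one expression value per gene."""
    return [sum(w * x for w, x in zip(row, slide_vec)) + b
            for row, b in zip(weights, biases)]


def mean_squared_error(slides, targets, weights, biases):
    total, count = 0.0, 0
    for x, y in zip(slides, targets):
        pred = predict_expression(x, weights, biases)
        total += sum((p - t) ** 2 for p, t in zip(pred, y))
        count += len(y)
    return total / count


def fit(slides, targets, lr=0.1, epochs=300):
    """Stochastic gradient descent on the squared error, one linear model per gene."""
    dim, n_genes = len(slides[0]), len(targets[0])
    weights = [[0.0] * dim for _ in range(n_genes)]
    biases = [0.0] * n_genes
    for _ in range(epochs):
        for x, y in zip(slides, targets):
            pred = predict_expression(x, weights, biases)
            for g in range(n_genes):
                err = pred[g] - y[g]
                weights[g] = [w - lr * err * xi for w, xi in zip(weights[g], x)]
                biases[g] -= lr * err
    return weights, biases


rng = random.Random(0)
dim, n_genes, n_slides, n_tiles = 4, 3, 40, 5

# Synthetic tile embeddings (in practice these would come from a pathology foundation model).
slides = [mean_pool([[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tiles)])
          for _ in range(n_slides)]

# Synthetic "measured" expression generated by a hidden linear rule, for illustration only.
true_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_genes)]
targets = [predict_expression(x, true_weights, [0.0] * n_genes) for x in slides]

zero_weights = [[0.0] * dim for _ in range(n_genes)]
loss_before = mean_squared_error(slides, targets, zero_weights, [0.0] * n_genes)
weights, biases = fit(slides, targets)
loss_after = mean_squared_error(slides, targets, weights, biases)
print(f"MSE before training: {loss_before:.4f}, after: {loss_after:.6f}")
```

The appeal suggested by the slide's price tags is that the image is the cheap input and the expression profile is the expensive output, so even an approximate prediction from the image can be valuable.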

26 of 29

Latest FMs boost accuracy

27 of 29

Spatial omics

WES

H&E

Bulk RNA-seq

Clinical data

Single-cell RNA-seq

  • 7 cancer indications: NSCLC, Ovarian, Bladder, Mesothelioma, Glioblastoma, TNBC, DLBCL
  • 7,000 patients

High volume data target

More data is coming

28 of 29

Micro

Macro

Transformer 1

Transformer 2

Transformer 3

Transformer N

Novel architectures are being explored to capture the complexity of biology…

Stay tuned!

29 of 29

Thank you!
