1 of 21

AI foundation models in biology

November 2025

Jean-Philippe Vert, PhD

Co-founder and CEO, Bioptimus

2 of 21

AI is widely used in drug discovery and development

3 of 21

But…

  • Long and expensive:
    • 12-15 years from discovery to approval
    • >$2B per drug

  • Error prone:
    • 90% failure rate in clinical trials

  • Massive unmet medical needs remain
    • 10 million people die of cancer each year (>1M in the EU)
    • 95% of “rare diseases” have no treatment option

  • Societal and economic burden
    • ~17% of GDP devoted to healthcare in the US, ~10% in the EU

4 of 21

Why?

F = ma

Engineered simplicity

Biology

Evolutionary tinkering

5 of 21

Hope: “Modern AI” can be a game-changer

6 of 21

The technology behind: no equation, but Foundation models

Machine learning model = Foundation model + Predictive model

  • Foundation model: large model, trained without annotations, shared across tasks
  • Predictive model: small model, trained with annotations, task-specific
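The split described on this slide, a large shared model plus a small task-specific one, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "foundation model" is a stand-in encoder with fixed random weights (no real pretraining), the "predictive model" is a nearest-centroid classifier, and the data are synthetic 2D points. None of the names come from the talk.

```python
import math
import random

EMBED_DIM = 8  # hypothetical embedding size, chosen for illustration


def make_frozen_encoder(input_dim, seed=0):
    """Stand-in for a pretrained foundation model: weights are fixed and never updated."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)] for _ in range(EMBED_DIM)]

    def encode(x):
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return encode


def fit_centroid_head(embeddings, labels):
    """Small task-specific model: one mean embedding (centroid) per label."""
    sums, counts = {}, {}
    for emb, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(emb))
        for i, v in enumerate(emb):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, emb):
    """Assign the label whose centroid is closest in embedding space."""
    return min(centroids, key=lambda label: math.dist(centroids[label], emb))


def make_toy_data(n_per_class, seed):
    """Two well-separated synthetic clusters, labeled 0 and 1."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, -3.0), (1, 3.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5)], label))
    return data


encoder = make_frozen_encoder(input_dim=2)   # shared across tasks
train = make_toy_data(n_per_class=20, seed=1)  # small annotated set
test = make_toy_data(n_per_class=20, seed=2)

# Only the small head sees the annotations; the encoder stays frozen.
centroids = fit_centroid_head([encoder(x) for x, _ in train], [y for _, y in train])
accuracy = sum(predict(centroids, encoder(x)) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The same encoder could be reused with a different head and a different labeled set, which is the sense in which the large model is "shared across tasks".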

7 of 21

LLM (e.g., GPT)

Chatbot

Text generation

Reasoning

[and many more…]

Large Language Models (LLMs) are Foundation Models that “learn the human language”

Confidential information: do not share without written permission from Bioptimus

8 of 21

Foundation model for biology

AI digital twin

AI diagnostics

AI drug design

[and many more…]

ORGANISMS

TISSUES

CELLS

MOLECULES

Can we similarly “learn the language of biology”?

9 of 21

ORGANISMS

TISSUES

CELLS

MOLECULES

Many Foundation Models have emerged recently to capture various aspects of biology

10 of 21

Ex: Histopathology

11 of 21

Available at https://huggingface.co/bioptimus/H-optimus-1 (~1M downloads)

H-optimus (2025)

import timm

model = timm.create_model("hf-hub:bioptimus/H-optimus-1", pretrained=True)

  • Trained on proprietary datasets of 2B+ images
    • 1M+ whole-slide images
    • 800k+ patients
    • 3 scanner types
    • 50+ organ tissues
    • 4k+ clinical practices
  • 1.1B-parameter vision transformer
  • Free for academic research
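Histopathology models are commonly trained on fixed-size tiles cropped from whole-slide images, which would be consistent with 2B+ images coming from 1M+ slides (roughly 2,000 images per slide on average). The sketch below enumerates such a tile grid. The 224-pixel tile size, the non-overlapping stride, and the slide dimensions are illustrative assumptions, not figures from the slide, and `tile_grid` is a hypothetical helper.

```python
def tile_grid(width, height, tile=224, stride=224):
    """Return the top-left (x, y) corners of every full tile inside a width x height image.

    Tiles that would extend past the right or bottom edge are skipped.
    """
    xs = range(0, width - tile + 1, stride)
    ys = range(0, height - tile + 1, stride)
    return [(x, y) for y in ys for x in xs]


# Hypothetical slide region of 10,000 x 8,000 pixels.
corners = tile_grid(width=10_000, height=8_000)
print(len(corners), "tiles")
```

In practice a tissue-detection step usually discards background tiles before they reach the model; that step is omitted here.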

12 of 21

More/better data + bigger models = better performance

13 of 21

Performance on PathBench, a large-scale public benchmark (Oct 25)

Average rank among 22 methods on 229 tasks (41k+ cases, 26 hospitals, 12 organs)
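As a reading aid for the "average rank" metric: on each task the methods are ordered by score, and each method's rank is then averaged over all tasks (lower is better). A minimal sketch, assuming higher scores are better and that there are no ties. The method names, task names, and scores below are made up and are not PathBench results.

```python
def average_ranks(scores):
    """scores: {task: {method: score}}, higher is better. Returns {method: mean rank}.

    Ties are not handled specially: tied methods keep their input order.
    """
    ranks = {}
    for task_scores in scores.values():
        ordered = sorted(task_scores, key=task_scores.get, reverse=True)
        for rank, method in enumerate(ordered, start=1):
            ranks.setdefault(method, []).append(rank)
    return {method: sum(r) / len(r) for method, r in ranks.items()}


# Toy scores for three hypothetical methods on three hypothetical tasks.
toy_scores = {
    "task_a": {"model_x": 0.91, "model_y": 0.88, "model_z": 0.80},
    "task_b": {"model_x": 0.75, "model_y": 0.79, "model_z": 0.70},
    "task_c": {"model_x": 0.66, "model_y": 0.60, "model_z": 0.62},
}

mean_rank = average_ranks(toy_scores)
best = min(mean_rank, key=mean_rank.get)
print(best, mean_rank)
```

Average rank rewards consistency across tasks, so a method can lead without winning every individual task, as `model_x` does here.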

14 of 21

Histology foundation models fuel AI diagnostics

15 of 21

From images to spatial omics

  • RGB image: $10
  • 20k genes: $10,000

16 of 21

H-optimus-1 enables in silico prediction of spatial gene expression

17 of 21

Multimodal, Multi-scale foundation model

ORGANISMS

TISSUES

CELLS

MOLECULES

Next frontier 1:

Connect the dots across scales and modalities

Data? Tokenizers? Architectures? Training? Evaluation?

18 of 21

Next frontier 2:

Causality and response to perturbation

From AI virtual cells

to AI virtual patients?

19 of 21

Next frontier 3:

From foundation to biological reasoning

AI scientist? Lab-in-the-loop?

https://chanzuckerberg.com/blog/rbio-reasoning-ai-model/

20 of 21

Can Europe compete in AI for life science?

  • No choice. We have to.

  • ✅ Research institutions, hospitals, density of talent in Bio+Tech, startups, large pharma/biotech

  • ❌ Compute infrastructure, public and private investment, regulation

  • Opportunity in data generation, and in training across disciplines

21 of 21

Thank you!

jp@bioptimus.com