1 of 20

Unsupervised Learning of Tumor Organoid Morphology and Drug Response

Ahmad Tariq

Unsupervised Machine Learning in Health

2 of 20

Why Do Some Tumors Survive Chemotherapy?

Prevalence:

Cancer remains one of the leading causes of death worldwide

Core Problem:

Tumor Organoids show high morphological heterogeneity

→ Drug Response varies unpredictably

[1] Wang et al., Journal/PMC, 2021

“Drug resistance and the [...] ineffectiveness of the drug treatment are responsible for up to 90% of the cancer related deaths” [1]

3 of 20

Motivation

Research Question:

Can unsupervised learning (UL) reveal morphological patterns in organoids that may relate to treatment response?

Goals:

Connect morphology → Clinical outcome

Discover hidden structure

Learn representations without labels

4 of 20

Dataset

Patient-derived tumor organoids

Limitations:

Small dataset (582 images)

Class imbalance risk

No ground truth

Example Images

5 of 20

PCA Fails to Capture Morphological Structure

50 components (75.5% variance explained)
Best k = 2
Low Silhouette Score = 0.16

Weak visual separation across clusters (k=2)

PCA Limitations

Linear method fails to capture complex morphology
High explained variance ≠ meaningful biological structure
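
The explained-variance figure above can be reproduced in spirit with a short sketch. This is a minimal illustration, assuming the crops are flattened to one row per image; the array `X` is random stand-in data and the helper name `pca_explained_variance` is hypothetical, not taken from the project code.

```python
import numpy as np

def pca_explained_variance(X, n_components):
    """Project X onto its top principal components via SVD.

    Returns the projected data and the fraction of total variance
    explained by each kept component.
    """
    Xc = X - X.mean(axis=0)                       # center each pixel/feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S**2 / (X.shape[0] - 1)                 # variance along each component
    ratio = var / var.sum()
    Z = Xc @ Vt[:n_components].T                  # low-dimensional coordinates
    return Z, ratio[:n_components]

# Hypothetical stand-in for flattened grayscale crops: 200 images of 32x32 pixels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32 * 32))

Z, ratio = pca_explained_variance(X, n_components=50)
print(Z.shape, round(float(ratio.sum()), 3))
```

A high cumulative ratio only says the components capture pixel variance; it does not say that variance is biologically relevant, which is the point of the slide.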

6 of 20

Limitations of K-means Clustering

Requires choosing k in advance
Assumes well-separated clusters
Sensitive to random initialization
Struggles with overlapping structure
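
Two of these limitations (choosing k, random initialization) are commonly handled by scanning several k values and restarting k-means multiple times. The sketch below assumes scikit-learn is available; the data are synthetic blobs, not organoid features, and `best_k_by_silhouette` is a hypothetical helper name.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(Z, k_values, seed=0):
    """Fit k-means for each k and return the k with the highest silhouette score."""
    scores = {}
    for k in k_values:
        # n_init=10 reruns k-means from 10 random starts and keeps the best fit,
        # which reduces sensitivity to initialization.
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
        scores[k] = silhouette_score(Z, labels)
    return max(scores, key=scores.get), scores

# Synthetic, well-separated blobs (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
Z = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])

best_k, scores = best_k_by_silhouette(Z, k_values=range(2, 7))
print(best_k)
```

On overlapping data the scores for neighboring k values tend to be close, so the selected k should be read as a heuristic rather than a ground truth.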

7 of 20

Motivation for Nonlinear Representation Learning (VAE)

Variational Autoencoder (VAE)

Learns nonlinear latent representation
Maps image → structured latent space

Preserves subtle features (transparency and density)

Limitations:

Latent space is not directly interpretable
Trained to reconstruct images, not to cluster them directly
May not reflect true biological structure
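
The "structured latent space" comes from two standard VAE ingredients: sampling via the reparameterization trick and a KL penalty toward a standard normal prior. The NumPy sketch below shows only these two pieces under the usual diagonal-Gaussian assumption; the encoder outputs `mu` and `log_var` are hypothetical placeholders, and no network is trained here.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)

rng = np.random.default_rng(0)
mu = np.zeros((4, 16))        # hypothetical encoder means, latent dimension 16
log_var = np.zeros((4, 16))   # log-variance 0 corresponds to unit variance

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

When the encoder output already matches the prior (zero mean, unit variance), the KL term is zero; it grows as the encoding drifts away from the prior.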

8 of 20

Methodology pipeline

Microscopy Images:

Brightfield organoid images

Preprocessing:

Bounding box cropping
Grayscale normalization

Feature Learning (VAE):

Encode images to a low-dimensional latent space
Learn nonlinear morphological features

Clustering:

K-means, with best k chosen by silhouette score

Visualize and Interpret:

t-SNE/UMAP of latent space

9 of 20

Why Cropping is Necessary

Raw Images

Multiple organoids
Background noise (lighting, debris)

Cropping

Single organoid
Standardizes inputs

Objective

Focuses on morphology
Improves feature learning

Allows for better clustering

train_copy: 702 crops saved

valid_copy: 32 crops saved

test_copy: 23 crops saved

Total combined crops: 757
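
A minimal sketch of the cropping and grayscale normalization step, assuming each organoid comes with a pixel bounding box. The box format `(x_min, y_min, x_max, y_max)`, the channel-average grayscale conversion, and the function name `crop_and_normalize` are assumptions for illustration; resizing crops to a common shape is omitted.

```python
import numpy as np

def crop_and_normalize(image, box):
    """Crop one organoid from an RGB image and return a grayscale array in [0, 1].

    box is (x_min, y_min, x_max, y_max) in pixel coordinates (hypothetical format).
    """
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1].astype(np.float64)    # rows are y, columns are x
    gray = crop.mean(axis=2)                          # simple channel average
    lo, hi = gray.min(), gray.max()
    if hi == lo:                                      # flat crop: avoid division by zero
        return np.zeros_like(gray)
    return (gray - lo) / (hi - lo)                    # min-max normalization

# Random stand-in for a brightfield image with several organoids.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
crop = crop_and_normalize(image, box=(20, 10, 84, 74))
```

Per-crop min-max scaling removes overall lighting differences between images, at the cost of discarding absolute brightness.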

10 of 20

Standard VAE Results

Learns nonlinear latent structure → strong clustering

Metrics

Latent dimension: 16
Best k: 3
Silhouette score: 0.61

Insights

Clear nonlinear separation
Distinct clusters formed
Minor overlap remains

Silhouette score ~4× higher than PCA (0.61 vs 0.16)

Nonlinear features learned via encoder–decoder + KL regularization

11 of 20

Standard VAE Visualization

Weak separation w/ significant overlap

Continuous structure w/ weak cluster boundaries

12 of 20

Cluster example photos (Standard VAE)

Cluster 0 (top left): Transparent, low-density organoids

Cluster 1 (middle): Mixed transparency, intermediate density

Cluster 2 (top right): Solid, high-density organoids

13 of 20

Alternative: Why β-VAE?

Reason

Standard VAE balances:

Reconstructing the image
Latent regularization

Reweighted balance:

Stronger constraint on Latent Space:

β × KL divergence

Removes unnecessary variation

Limitations:

May over-compress the latent space
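
The reweighting can be sketched as a loss function in which β multiplies the KL term. This is an illustration only: the squared-error reconstruction term, the value β = 4, and all arrays below are assumptions, since the slides do not state which reconstruction loss or β was used.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Batch mean of: squared reconstruction error + beta * KL term."""
    recon = np.sum((x - x_recon) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return float(np.mean(recon + beta * kl))

rng = np.random.default_rng(0)
x = rng.random((8, 64))                               # hypothetical flattened images
x_recon = x + 0.05 * rng.standard_normal(x.shape)     # stand-in decoder output
mu = rng.standard_normal((8, 8))                      # latent dimension 8
log_var = np.zeros((8, 8))

loss_vae = beta_vae_loss(x, x_recon, mu, log_var, beta=1.0)    # standard VAE
loss_beta = beta_vae_loss(x, x_recon, mu, log_var, beta=4.0)   # heavier KL penalty
```

With β = 1 this reduces to the standard VAE objective; larger β penalizes the same encoding more heavily, which pushes the model toward a more compressed latent code.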

14 of 20

Beta VAE Results

Learns a more constrained latent structure → weaker cluster separation

Metrics

Latent dimension: 8
Best k: 3
Silhouette score: 0.27

Insights

Stronger KL constraint → reduced latent space
Forces model to keep only important variation
Clearer and more interpretable latent space

Trade-off: stronger regularization lowers the silhouette score (0.61 → 0.27)

β controls information bottleneck → larger β → simpler latent structure

15 of 20

Beta VAE Visualization

Partial separation with noticeable overlap

Organized but not strongly separated

t-SNE

UMAP
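
A 2D projection of this kind can be produced from latent vectors as sketched below, assuming scikit-learn's `TSNE`. The latent vectors and cluster labels are synthetic placeholders, and the perplexity value is an assumed default rather than the setting used for the slides. UMAP would follow the same pattern through the separate `umap-learn` package.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for 8-dimensional latent vectors from three clusters.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 8))
latents = np.vstack([c + rng.normal(size=(50, 8)) for c in centers])
labels = np.repeat(np.arange(3), 50)   # cluster ids, used only to color a plot

# Project to 2D; random_state fixes the layout for reproducibility.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(latents)
print(embedding.shape)
```

Distances and gaps in a t-SNE plot depend on perplexity and the random seed, so apparent overlap or separation in 2D should be checked against the silhouette score computed in the full latent space.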

16 of 20

Cluster example photos from Beta VAE

Cluster 0 (top left): Transparent, low-density organoids

Cluster 1 (middle): Mixed transparency, intermediate density

Cluster 2 (top right): Solid, high-density organoids

17 of 20

Conclusion: Clinical Drug Response

Transparent, Low-Density Organoids

Most drug resistant
Highest viability under treatment

Partially Transparent, Intermediate-Density Organoids

Moderate response to treatment

Solid, High-Density Organoids

Lowest viability
Most sensitive to treatment

Summary: Organoid transparency correlates with increased drug resistance in tumor models

18 of 20

Prior Work

Transparent organoids exhibit the highest viability, solid the lowest, with intermediate in between.

Drug response (5-FU) vs viability

Aligns with our β-VAE clusters linking morphology to drug response

19 of 20

Q&A

How do we know clusters are actually meaningful biologically and why isn’t UL widely used in clinical settings?

Why didn’t we get stronger separation between clusters?

Does strong cluster separation really matter if the clusters are still biologically meaningful?

20 of 20

Citations

https://pmc.ncbi.nlm.nih.gov/articles/PMC8315569/

https://arxiv.org/abs/1707.01700

https://docs.voxel51.com/tutorials/clustering.html

https://arxiv.org/abs/1807.05520

https://github.com/facebookresearch/deepcluster/blob/main/main.py

https://pmc.ncbi.nlm.nih.gov/articles/PMC6677277/

https://github.com/schaugf/ImageVAE/blob/master/image_vae.py

https://www.mdpi.com/2313-433X/6/5/29

https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html

https://www.geeksforgeeks.org/deep-learning/pytorch-for-unsupervised-clustering/
