1 of 18

Predicting Spatially Resolved Gene Expression via Tissue Morphology using Adaptive Spatial Graph Neural Networks

Tianci Song1,2, Eric Cosatto2, Gaoyuan Wang3,4, Rui Kuang1, Mark Gerstein3,4, Martin Renqiang Min2 and Jonathan Warrell2,3,4

1Department of Computer Science and Engineering, University of Minnesota

2Machine Learning Department, NEC Laboratories America

3Department of Molecular Biophysics and Biochemistry, Yale University

4Program in Computational Biology and Bioinformatics, Yale University

2 of 18

Spatial Context is Important in Transcriptomics Studies


3 of 18

Spatial Context is Important in Transcriptomics Studies (Cont’d)


4 of 18

  • ISC methods profile spatial transcriptomics by capturing mRNA in situ on arrayed spots that carry positional barcodes, then sequencing it ex situ [1-2].

In-Situ Capturing to Profile Spatial Transcriptomics Data

  • Pros:
    • Readily available commercial options;
    • Transcriptome-wide profiling;
    • Near-cellular spatial resolution.

  • Cons:
    • Prohibitive cost.

The high cost of ISC methods makes them difficult to use in clinical practice and in large-scale studies (e.g., precision medicine).


5 of 18

  • Predicting spatial transcriptomics data from tissue morphology in the staining image could be an affordable alternative to ISC methods.

Spatial Transcriptomics Prediction via Tissue Morphology

[Figure: a staining image is divided into image patches, one per spot i at coordinates (x, y), shown next to the spatial expression matrix indexed by spot coordinates. CNN learning task: predict the expression for each spot with the corresponding image patch. No spatial information leveraged.]
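The per-spot setup above can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as nested lists and spot centres given in pixel coordinates; the names `extract_patch`, `patches_for_spots`, and the `half` window parameter are hypothetical and not taken from the slides. The CNN itself is omitted.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def extract_patch(image: Image, x: int, y: int, half: int) -> Image:
    """Return the square patch centred on pixel column x, row y.

    The window is clipped at the image border, so patches near an edge
    can be smaller than (2 * half + 1) pixels on a side.
    """
    height, width = len(image), len(image[0])
    top, bottom = max(0, y - half), min(height, y + half + 1)
    left, right = max(0, x - half), min(width, x + half + 1)
    return [row[left:right] for row in image[top:bottom]]


def patches_for_spots(
    image: Image, spots: Dict[str, Tuple[int, int]], half: int
) -> Dict[str, Image]:
    """Map each spot id to the image patch around its (x, y) centre."""
    return {
        spot_id: extract_patch(image, x, y, half)
        for spot_id, (x, y) in spots.items()
    }


# Toy 6x6 image whose pixel value encodes its position: 10 * row + column.
toy_image = [[float(10 * r + c) for c in range(6)] for r in range(6)]
toy_spots = {"spot_0": (2, 2), "spot_1": (5, 0)}
toy_patches = patches_for_spots(toy_image, toy_spots, half=1)
```

Each patch would then be fed to the image model independently, which is why no information is shared between neighboring spots in this baseline.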


6 of 18

    • Modeling the spatial neighboring relations among image patches as a graph, then aggregating image features over this spatial adjacency graph, could improve spatial gene expression prediction.
    • Hard-coded spatial relations in the spatial adjacency graph might not be present in the actual spatial gene expression.

Leveraging Spatial Relations in Spatial Transcriptomics Prediction

[Figure: the default spatial adjacency graph, built from spatial neighborhoods, is adaptively refined by removing irrelevant (redundant) spatial relations, so that the refined spatial adjacency graph follows spatial proximity in gene expression. Annotated regions: tumor boundary, tumor microenvironment.]
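A default spatial adjacency graph and a simple feature aggregation over it might look like the sketch below. It assumes spots are linked when their centres lie within a fixed `radius` and that aggregation is a plain mean over a spot and its neighbors; both choices, and the function names, are illustrative assumptions rather than the construction used in this work.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def build_spatial_graph(
    coords: Sequence[Tuple[float, float]], radius: float
) -> Dict[int, Set[int]]:
    """Link every pair of spots whose centres are at most `radius` apart."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= radius:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def aggregate_mean(
    features: Sequence[Sequence[float]], graph: Dict[int, Set[int]]
) -> List[List[float]]:
    """Replace each spot's feature vector by the mean over itself and its neighbors."""
    smoothed = []
    for i, feat in enumerate(features):
        group = [i] + sorted(graph[i])
        smoothed.append(
            [sum(features[j][d] for j in group) / len(group) for d in range(len(feat))]
        )
    return smoothed


# Three spots on a unit grid plus one distant spot with no neighbors.
spot_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
spatial_graph = build_spatial_graph(spot_coords, radius=1.0)
patch_features = [[1.0, 0.0], [3.0, 0.0], [5.0, 3.0], [7.0, 7.0]]
smoothed_features = aggregate_mean(patch_features, spatial_graph)
```

Because the edges depend only on distance, two adjacent spots on opposite sides of a tumor boundary are still averaged together, which is the kind of redundant relation that refinement is meant to remove.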


7 of 18

Adaptive Spatial Graph Neural Networks (asGNN)

 

[Figure: asGNN overview. Panel labels: staining image with capturing spot array; image patch; encoder; embedding; linear meta-feature transformation; Affinity Propagation clustering; removing inter-cluster edges via Adaptive Graph Refinement; spatial GNN architecture (GNN layers, linear layer); parameter distribution updating (updating rules, training score).]
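The edge-removal step named in the figure can be illustrated in isolation. The sketch below assumes cluster labels are already available (in asGNN they come from Affinity Propagation clustering and are adapted during training, which is not reproduced here); `refine_graph` and the toy labels are hypothetical names.

```python
from typing import Dict, Hashable, Set


def refine_graph(
    graph: Dict[int, Set[int]], labels: Dict[int, Hashable]
) -> Dict[int, Set[int]]:
    """Keep only edges whose two endpoints share a cluster label."""
    return {
        node: {nbr for nbr in nbrs if labels[nbr] == labels[node]}
        for node, nbrs in graph.items()
    }


# A chain of four spots; the middle edge (1, 2) crosses two clusters.
default_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cluster_labels = {0: "a", 1: "a", 2: "b", 3: "b"}
refined_graph = refine_graph(default_graph, cluster_labels)
```

After refinement, message passing in the GNN layers would only share information within a cluster, since the inter-cluster edge between spots 1 and 2 is gone.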

 

 

 

 


8 of 18

  • Objective function and optimization:

Adaptive Spatial Graph Neural Networks (asGNN) (Cont’d)

1. Reconstruction Loss

2. Correlation Regularization

3. Spatial Graph Refinement

4. Message Passing in Graph Neural Networks

5. Smoothing-based Optimization [3, 4]
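Items 1 and 2 can be pictured with a small sketch. It assumes the reconstruction loss is the mean squared error and that the correlation regularizer penalizes `1 - PCC` with a weight `lam`; the exact form and weighting used in asGNN are not given on this slide, so both are assumptions, as are the function names.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predicted and measured expression."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pearson(pred: Sequence[float], target: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(pred)
    mean_p, mean_t = sum(pred) / n, sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    denom = math.sqrt(var_p * var_t)
    return cov / denom if denom > 0 else 0.0


def combined_loss(
    pred: Sequence[float], target: Sequence[float], lam: float = 1.0
) -> float:
    """Assumed objective: MSE plus lam * (1 - PCC) for one gene across spots."""
    return mse(pred, target) + lam * (1.0 - pearson(pred, target))


# One gene measured at three spots: the prediction has the right trend
# but the wrong scale, so only the MSE term contributes.
gene_pred = [1.0, 2.0, 3.0]
gene_true = [2.0, 4.0, 6.0]
gene_loss = combined_loss(gene_pred, gene_true)
```

The two terms are complementary: MSE is sensitive to scale, while the correlation term rewards predictions that rank spots correctly even when their magnitude is off.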


9 of 18


Experimentation

  • Baselines
    • No spatial relations leveraged:
      • ST-Net [5]: CNN model fine-tuned for spatial gene expression prediction.
    • Model spatial relations implicitly:
      • HisToGene [8]: vision transformer for spatial gene expression prediction.
    • Model spatial relations explicitly:
      • GTN [9]: graph transformer network on the full spatial adjacency graph;
      • AP-GTN: graph transformer network on the spatial graph refined by clustering.


10 of 18

  • Holdout validation: 68 tissue sections, stratified by molecular subtype, are split into training, validation, and test sets of 38, 15, and 15 sections, respectively.
  • External validation: 24 tissue sections are used to evaluate generalization.

Prediction Performance on Human Breast Cancer

Method      Spatial Graph  Loss     Features       Holdout MSE ↓  Holdout PCC ↑  External MSE ↓  External PCC ↑
ST-Net*     N/A            MSE      –              0.712          0.065          1.081           0.292
HisToGene*  N/A            MSE      –              0.723          0.024          1.297           0.204
GTN*        Full           MSE      Morphological  0.719          0.063          1.246           0.199
GTN*        Full           MSE      Convolutional  0.736          0.065          1.071           0.280
AP-GTN*     Pre-clustered  MSE      Morphological  0.716          0.051          1.361           0.125
AP-GTN*     Pre-clustered  MSE      Convolutional  0.733          0.074          0.986           0.235
asGNN*      Adaptive       MSE      Morphological  0.701          0.069          1.213           0.210
asGNN*      Adaptive       MSE      Convolutional  0.705          0.083          0.990           0.288
GTN         Full           MSE+PCC  Morphological  0.710          0.090          1.240           0.204
GTN         Full           MSE+PCC  Convolutional  0.711          0.101          0.961           0.297
AP-GTN      Pre-clustered  MSE+PCC  Morphological  0.713          0.073          1.302           0.193
AP-GTN      Pre-clustered  MSE+PCC  Convolutional  0.721          0.098          0.973           0.242
asGNN       Adaptive       MSE+PCC  Morphological  0.703          0.103          1.208           0.212
asGNN       Adaptive       MSE+PCC  Convolutional  0.696          0.113          0.932           0.312

ST-Net* and HisToGene* have a single set of results, with no split by feature type.

* denotes a model optimized with the MSE loss only (w/o correlation regularization); models without * use the MSE+PCC loss (w/ correlation regularization).


11 of 18

Prediction Performance on Human Breast Cancer (Cont’d)


12 of 18

  • Spatial domains are detected with two strategies: merging clusters from Affinity Propagation clustering (AP), or merging connected components of the refined spatial graph (CC).

Spatial Domain Detection on Human Breast Cancer


13 of 18

Spatial Domain Detection on Human Breast Cancer (Cont’d)

Note that all the singleton clusters are excluded for better visualization
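The CC strategy starts from the connected components of the refined spatial graph. The sketch below shows only that extraction step and the exclusion of singletons mentioned in the note; how components are subsequently merged into domains is not specified on these slides and is not attempted here. Function names and the `min_size` parameter are hypothetical.

```python
from typing import Dict, List, Set


def connected_components(graph: Dict[int, Set[int]]) -> List[Set[int]]:
    """Return the connected components of an undirected graph via depth-first search."""
    seen: Set[int] = set()
    components: List[Set[int]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


def candidate_domains(graph: Dict[int, Set[int]], min_size: int = 2) -> List[Set[int]]:
    """Drop components smaller than min_size (singletons by default)."""
    return [c for c in connected_components(graph) if len(c) >= min_size]


# Refined graph with two multi-spot components and one isolated spot.
toy_refined_graph = {0: {1}, 1: {0}, 2: {3, 4}, 3: {2}, 4: {2}, 5: set()}
toy_domains = candidate_domains(toy_refined_graph)
```

Spot 5 has no remaining edges after refinement, so it forms a singleton and is left out of the visualized domains.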


14 of 18

asGNN: Prototype Clusters and Enrichment Analysis

k=5

k=10


15 of 18

  • Conclusions

    • State-of-the-art performance for spatial gene expression prediction from histological images;
    • Adaptive spatial graph structure, where information is shared only within coherent spatial domains;
    • End-to-end training using a smoothing-based variational optimization;
    • Spatial domains align well with annotations and can be readily interpreted.

  • Future work

    • Incorporate bulk transcriptomics in asGNN training to improve spatial expression prediction;
    • Explore the potential of asGNN in improving spatial resolution, predicting whole transcriptome expression, and modeling inter-cluster dependencies to refine spatial graph;
    • Investigate the potential of asGNN to identify spatial biomarkers for clinical diagnostics;
    • Apply asGNN to identify de novo tumor-type specific tumor or microenvironment domains.

Conclusions and Future work


16 of 18

[1] Asp, Michaela, et al. "Spatially resolved transcriptomes—next generation tools for tissue exploration." BioEssays (2020).

[2] Moses, Lambda, et al. "Museum of spatial transcriptomics." Nature Methods (2022).

[3] Leordeanu, Marius, et al. "Smoothing-based optimization." IEEE Conference on Computer Vision and Pattern Recognition (2008).

[4] Wang, Gaoyuan, et al. "A variational graph partitioning approach to modeling protein liquid-liquid phase separation." bioRxiv preprint (2024).

[5] He, Bryan, et al. "Integrating spatial gene expression and breast tumour morphology via deep learning." Nature Biomedical Engineering (2020).

[6] Andersson, Alma, et al. "Spatial deconvolution of HER2-positive breast cancer delineates tumor-associated cell type interactions." Nature Communications (2021).

[7] Cosatto, Eric, et al. "Automated gastric cancer diagnosis on H&E-stained sections; training a classifier on a large scale with multiple instance machine learning." Medical Imaging 2013: Digital Pathology, SPIE (2013).

[8] Pang, Minxing, et al. "Leveraging information in spatial transcriptomics to predict super-resolution gene expression from histology images in tumors." bioRxiv (2021).

[9] Shi, Yunsheng, et al. "Masked label prediction: Unified message passing model for semi-supervised classification." arXiv (2020).

References


17 of 18

NEC Lab America (Machine Learning Group)

University of Minnesota (Kuang Lab)

Acknowledgment

Dr. Jonathan Warrell

Dr. Eric Cosatto

Dr. Martin Renqiang Min

Dr. Rui Kuang

Yale University (Gerstein Lab):

Charles Broadbent

Sharada Sridhar

Yoshitaka Inoue

Ethan Kulman

Dr. Mark Gerstein

Dr. Gaoyuan Wang

Yale School of Medicine


18 of 18

This work is supported by NEC Laboratories America and by funding from NSF project IIBR 2042159.