1 of 25

Inducing topological features with PH

Python Course on Topological Methods in Data Analysis, Sebastian Damrich, IAL/HCI/IWR, 27.10.2020

2 of 25

PH on point clouds

Garin, A., & Tauzin, G. (2019).

Pun, C. S. et al. (2018).

3 of 25

PH on point clouds

4 of 25

PH on images

Figures from: Garin, A., & Tauzin, G. (2019).

5 of 25

PH on images - example

Figures from Clough et al. (2019)

6 of 25

Forward and Backward pass of PH

Figures adapted from Clough et al. (2019)

forward pass

7 of 25

Forward and Backward pass of PH

Figures adapted from Clough et al. (2019)

forward pass

loss

8 of 25

Forward and Backward pass of PH

Figures adapted from Clough et al. (2019)

forward pass

loss

backward pass

9 of 25

Forward and Backward pass of PH

Figures adapted from Clough et al. (2019)

forward pass

loss

backward pass

10 of 25

Backward pass PH

Figures adapted from Hu, X. et al. (2019).

11 of 25

Backward pass PH

Figures adapted from Hu, X. et al. (2019).

Images (current top, desired bottom)

Persistence diagrams

Matching of diagrams

12 of 25

Backward pass PH

Identify points of birth / death of features
Update according to matched ground truth feature

Figures adapted from Hu, X. et al. (2019).

13 of 25

Backward pass with Gradient Descent

PH is differentiable
PH can provide the gradient of a global, topological feature

Use gradient descent to update the data (e.g. induce a topological feature in an image)
Backpropagate to the neural network generating the data

But:
 - Gradient is sparse
 - Localisation of feature might be unexpected
 - Pure topology is considered
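Why the gradient is sparse can be seen in a minimal sketch. It assumes a 1D signal instead of an image and 0-dimensional sublevel-set persistence; the function names and the toy signal are illustrative and not taken from the cited papers. Each finite feature is tied to exactly one birth entry and one death entry, so the gradient of the total persistence is nonzero only at those entries.

```python
def persistence_pairs_1d(values):
    """0-dimensional sublevel-set persistence of a 1D signal.

    Returns a list of (birth_index, death_index) pairs. The component born at
    the global minimum never dies and is therefore not listed.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # sweep from low to high
    parent = [None] * n  # union-find forest; None means "not added yet"
    birth = {}           # root of a component -> index of its minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component with the higher minimum dies at i.
                if (values[birth[a]], birth[a]) < (values[birth[b]], birth[b]):
                    a, b = b, a  # now a is the younger component
                if birth[a] != i:  # skip zero-persistence pairs
                    pairs.append((birth[a], i))
                parent[a] = b
    return pairs


def total_persistence_grad(values):
    """Gradient of sum(values[death] - values[birth]) with respect to values."""
    grad = [0.0] * len(values)
    for b, d in persistence_pairs_1d(values):
        grad[d] += 1.0  # raising the death value increases persistence
        grad[b] -= 1.0  # raising the birth value decreases persistence
    return grad


signal = [0.0, 1.0, 2.0, 1.5, 1.2, 3.0, 4.0, 0.5, 2.5, 5.0]
grad = total_persistence_grad(signal)
print(persistence_pairs_1d(signal))  # [(4, 2), (7, 6)]
print(grad)  # nonzero only at indices 2, 4, 6 and 7
```

Only four of the ten entries receive a gradient, and which entries those are depends on where the births and deaths happen to sit, not on where one might expect the feature to be.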

14 of 25

Examples: CREMI instance segmentation

Use a topological loss to guide the training of the NN towards topologically relevant parts

Raw data from https://cremi.org/data/, NN by courtesy of Alberto Bailoni

raw

our NN boundary prediction

before topological loss finetuning

GT boundary

15 of 25

Examples: CREMI instance segmentation

Idea by Hu et al. (2019)

Left to right: Raw, optimisation of boundary prediction in conjunction with topological loss
Figure from Hu et al. (2019)

16 of 25

Toy Example: Restore topological features in image

$x'$: corrupted image
$\operatorname{argmin}_x \; \|x - x'\|^2 + \mathrm{TopoLoss}(x)$

Loop is “thin” and non-smooth
Single low pixel inside the loop
but near-perfect persistence diagram

Example adapted from Brüel-Gabrielsson, R. et al. (2019).

corrupted image and persistence diagram

restored image and persistence diagram
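The objective on this slide can be sketched as a 1D stand-in. Everything here is an assumption made for illustration: the signal replaces the image, TopoLoss is taken to be the total finite 0-dimensional persistence (so it removes a spurious dip instead of restoring a loop), and the weight, step size and number of steps are arbitrary. It is not the setup of Brüel-Gabrielsson et al.

```python
def persistence_pairs_1d(values):
    """(birth_index, death_index) pairs of 0-dim sublevel-set persistence."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n  # None means "not added yet"
    birth = {}           # root of a component -> index of its minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                if (values[birth[a]], birth[a]) < (values[birth[b]], birth[b]):
                    a, b = b, a  # a is the younger component and dies at i
                if birth[a] != i:
                    pairs.append((birth[a], i))
                parent[a] = b
    return pairs


def finite_persistence(values):
    """Assumed TopoLoss: total persistence of all finite features."""
    return sum(values[d] - values[b] for b, d in persistence_pairs_1d(values))


def restore(corrupted, weight=1.0, lr=0.1, steps=50):
    """Gradient descent on ||x - x'||^2 + weight * finite_persistence(x)."""
    x = list(corrupted)
    for _ in range(steps):
        grad = [2.0 * (xi - ci) for xi, ci in zip(x, corrupted)]  # data term
        for b, d in persistence_pairs_1d(x):  # sparse topological term
            grad[d] += weight
            grad[b] -= weight
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


corrupted = [0.0, 1.0, 0.8, 1.1, 2.0]  # small spurious dip at index 2
restored = restore(corrupted)
print(finite_persistence(corrupted), finite_persistence(restored))
print(restored)
```

Only the two entries that form the dip are ever moved; all other entries stay exactly at their corrupted values. With a fixed step size the iterate hovers around the point where the dip vanishes instead of settling, so the result can look non-smooth even though its persistence diagram is close to the target.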

17 of 25

Toy example: Regularise linear model

$Y = X \beta_{GT} + \varepsilon$

$\beta_{OLS} = \operatorname{argmin}_\beta \; \|Y - X\beta\|^2$

$\beta_{top} = \operatorname{argmin}_\beta \; \|Y - X\beta\|^2 + \mathrm{TopoLoss}(\beta)$

Example from Brüel-Gabrielsson, R. et al. (2019).

18 of 25

References

  1. Brüel-Gabrielsson, R. et al. (2019). A Topology Layer for Machine Learning. arXiv preprint arXiv:1905.12200.
  2. Clough, J. R. et al. (2019). A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology. arXiv preprint arXiv:1910.01877.
  3. Garin, A., & Tauzin, G. (2019). A Topological "Reading" Lesson: Classification of MNIST using TDA. arXiv preprint arXiv:1910.08345.
  4. Hu, X., Fuxin, L., Samaras, D., & Chen, C. (2019). Topology-Preserving Deep Image Segmentation. NeurIPS.
  5. Pun, C. S. et al. (2018). Persistent-Homology-based Machine Learning and its Applications--A Survey. arXiv preprint arXiv:1811.00252.

19 of 25

Mapper on scRNAseq data

Project proposal

20 of 25

What is scRNAseq data

scRNAseq stands for “single cell RNA sequencing”

RNA = ½ DNA

Gene expression:
 - gene: sequence of DNA / RNA responsible for biological behaviour
 - expression: process of generating a protein from a gene

Measure each cell's expression strength for many different genes

21 of 25

What does scRNAseq data look like

1 0 0 0 0 34 0 0 2 0 … 4
:
:

22 of 25

What does scRNAseq data really look like?

Dentate Gyrus [2]
 - 2999d data
 - 24185 points

2D UMAP

Endocrine pancreas [1]
 - 3999d data
 - 36351 points

2D UMAP

[1] A. Bastidas-Ponce et al. (2019)
[2] H. Hochgerner et al. (2018)

23 of 25

Extracting developmental trajectories

Figure from https://scvelo.readthedocs.io/ depicting Endocrine pancreas dataset

Waddington’s epigenetic landscape
Figure from Sandoval et al. (2014)

24 of 25

Project: Mapper on scRNAseq data

Apply Mapper to get a graph representation, ideally showing developmental trajectories

Need to choose a filter function (e.g. L2 distance of gene expression levels)
Perhaps subsample the dataset

Visualise the graph overlaid on the 2D UMAP embedding

Figure from Nicolau et al. (2011)
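The Mapper steps named on this slide (filter function, overlapping cover of the filter range, clustering inside each cover element, edges between clusters that share points) can be sketched in plain Python. This is a minimal illustration under assumptions, not the project's implementation: the data is a toy circle of 12 points instead of gene expression vectors, the filter is the L2 distance to a hypothetical root point, clustering is a simple connected-components rule with radius `eps`, and all parameter values are arbitrary.

```python
import math
from itertools import combinations


def single_linkage_clusters(points, members, eps):
    """Connected components of the eps-neighbourhood graph on `members`."""
    unvisited = set(members)
    clusters = []
    for seed in members:
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        cluster, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(cluster)
    return clusters


def mapper_graph(points, filter_values, n_intervals, overlap, eps):
    """Return (nodes, edges); each node is a set of point indices."""
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Interval k, widened on both sides so that neighbours overlap.
        a = lo + k * length - overlap * length
        b = lo + (k + 1) * length + overlap * length
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage_clusters(points, members, eps))
    # Two nodes are connected if their clusters share at least one point.
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Toy data: 12 points on the unit circle; the root point is an assumption.
points = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
root = points[6]
filter_values = [math.dist(p, root) for p in points]

nodes, edges = mapper_graph(points, filter_values,
                            n_intervals=3, overlap=0.3, eps=0.6)
print(nodes)
print(sorted(edges))  # [(0, 1), (0, 2), (1, 3), (2, 3)]
```

On this toy input the four nodes form a single cycle, which recovers the loop of the circle. For scRNAseq data the same steps would run on (possibly subsampled) expression vectors, and the choice of filter, cover and clustering would need tuning.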

25 of 25

References

  1. A. Bastidas-Ponce et al., “Comprehensive single cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis,” Development, vol. 146, no. 12, p. dev173849, Jun. 2019.
  2. H. Hochgerner et al., “Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing,” Nature Neuroscience, vol. 21, no. 2, pp. 290–299, 2018.
  3. Nicolau, M., Levine, A. J., & Carlsson, G. (2011). Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proceedings of the National Academy of Sciences, 108(17), 7265-7270.
  4. Sandoval, Santiago, et al. "Criticality in gene networks." (2014)