1 of 36

Topological Data Analysis

Leah Valentiner and Elaine Yang

2 of 36

Topology Intuition

  • Math with (squishy) shapes
  • How’s it different from Geometry?

3 of 36

4 of 36

Topology Intuition (more rigorous…)

  • Topology is the mathematical study of properties of shapes that remain unchanged under continuous deformations
    • things like stretching, bending, or twisting, but not tearing or gluing

  • Key Idea: Objects are topologically equivalent if one can be continuously transformed into the other.

5 of 36

Homology groups

  • Hₖ(X): the k-th homology group of the space X

  • Topologically equivalent spaces have the same homology groups!

6 of 36

-> Betti Numbers:

the ranks of the homology groups

  • Introduced by Enrico Betti (1871)
  • Formalized by Henri Poincaré (1895)
  • Foundation for modern Computational Topology and Topological Data Analysis (TDA)

Conceptual Bridge:

  • We associate algebraic groups (H₀, H₁, H₂, …) to spaces; these groups capture information about holes of particular dimensions.

7 of 36

Betti Numbers: Computational Topology

  • Computational topology applies algorithms to study and compare shapes based on their topological features — such as connected components, tunnels, and voids — rather than their geometric size or position.
  • Betti Numbers: Number of independent k-dimensional holes (rank of Hₖ).

8 of 36

Hₖ, Zₖ and Bₖ

  • Triangle & Filled-In-Triangle
  • A k-cycle is a k-dimensional object that has no boundary (it's closed).
    • Zₖ: group of k-cycles
  • A k-boundary is a k-cycle that is the boundary of a (k+1)-dimensional object in the space.
    • Bₖ: group of k-boundaries
  • Homology group: the quotient group Hₖ = Zₖ/Bₖ. Its nonzero elements are closed structures that exist in the space but cannot be "filled in" by a higher-dimensional structure within the space.
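
A worked version of the triangle example may help here. This is a sketch under assumed labels: the vertices a, b, c, the edges ab, bc, ac, and the sign convention ∂(xy) = y - x are chosen for illustration and are not taken from the slide.

```latex
\begin{align*}
\partial(ab) &= b - a, \qquad \partial(bc) = c - b, \qquad \partial(ac) = c - a \\
z &= ab + bc - ac, \qquad \partial z = (b - a) + (c - b) - (c - a) = 0 \\
\text{hollow triangle:}\quad & Z_1 = \langle z \rangle,\; B_1 = 0 \;\Rightarrow\; H_1 = Z_1 / B_1 \cong \mathbb{Z} \\
\text{filled triangle:}\quad & \partial(abc) = bc - ac + ab = z \;\Rightarrow\; B_1 = Z_1,\; H_1 = 0
\end{align*}
```

In the hollow triangle the cycle z survives as a hole; in the filled-in triangle the same cycle is a boundary, so the hole disappears.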

9 of 36

Some Other Examples

10 of 36

Simplicial Complex

Note: simplicial complexes are generalizations of graphs

11 of 36

Whiteboard

“Colloquial” version of definition

Several good examples

One non-example

Point out a cycle and a boundary

12 of 36

Homology Groups and Simplicial Complexes

  1. Write down boundary matrices
  2. Move them into Smith Normal Form
  3. Use the SNF boundary matrices to find Betti numbers

13 of 36

Boundary Matrix

Create a matrix Mk to represent boundary information:

  1. Label each row with a (k-1)-simplex
  2. Label each column with a k-simplex
  3. Put a 1 or -1 in an entry if the row's (k-1)-simplex is part of the boundary of the column's k-simplex (use the orientation rule to decide the sign); otherwise put a 0
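
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the construction used in the talk: the function name `boundary_matrix` is hypothetical, simplices are assumed to be tuples of vertex labels in increasing order, and the sign is assumed to follow the common alternating convention (dropping the i-th vertex gives sign (-1)^i), which may differ from the orientation rule on the next slide.

```python
def boundary_matrix(k_simplices, faces):
    """Build M_k: rows are (k-1)-simplices, columns are k-simplices.

    Assumes every simplex is a tuple of vertex labels in increasing order.
    """
    row_of = {face: i for i, face in enumerate(faces)}
    matrix = [[0] * len(k_simplices) for _ in faces]
    for col, simplex in enumerate(k_simplices):
        for i in range(len(simplex)):
            # Dropping the i-th vertex gives a boundary face with sign (-1)^i.
            face = simplex[:i] + simplex[i + 1:]
            matrix[row_of[face]][col] = (-1) ** i
    return matrix


# Filled-in triangle on vertices 0, 1, 2.
vertices = [(0,), (1,), (2,)]
edges = [(0, 1), (0, 2), (1, 2)]
triangles = [(0, 1, 2)]

M1 = boundary_matrix(edges, vertices)    # edges -> vertices
M2 = boundary_matrix(triangles, edges)   # triangle -> edges
print(M1)  # [[-1, -1, 0], [1, 0, -1], [0, 1, 1]]
print(M2)  # [[1], [-1], [1]]
```

Each column of M1 has one -1 (the edge's start vertex) and one 1 (its end vertex), matching the "1 or -1" rule in step 3.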

14 of 36

Orientation rule

15 of 36

Boundary Matrix Example

16 of 36

Smith Normal Form

Similar to reduced row echelon form, but uses both row and column operations

All zeros, except for the upper left block, which is a diagonal matrix
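
As a rough picture of that shape: over the integers, the diagonal entries d_1, …, d_r are nonzero and each divides the next. The example matrix below is an assumption for illustration: it is the edge-to-vertex boundary matrix of a triangle with edges (0,1), (0,2), (1,2) under an alternating-sign convention.

```latex
\[
\mathrm{SNF}(M) =
\begin{pmatrix}
d_1 &        &     &   \\
    & \ddots &     &   \\
    &        & d_r &   \\
    &        &     & 0
\end{pmatrix},
\qquad d_i \mid d_{i+1},
\qquad\text{e.g.}\qquad
\begin{pmatrix} -1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}
\;\longrightarrow\;
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]
```

The example has two nonzero diagonal entries, so its rank is 2 and one column is zeroed out.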

17 of 36

Rules for finding Betti Numbers

βₖ = rank(Zₖ) - rank(Bₖ)

cycles - boundaries

rank(Zₖ) = # of zero columns of SNF(Mₖ)

columns associated with cycles get zeroed out

rank(Bₖ) = # of nonzero rows of SNF(Mₖ₊₁)

rows associated with boundaries are non-zero
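
These rules can be checked numerically. The sketch below swaps in a simpler technique: instead of computing the full Smith Normal Form, it computes the matrix rank by Gaussian elimination over the rationals, which equals the number of nonzero diagonal entries of the SNF. The names `rank` and `betti` and the example matrices are assumptions for illustration (a triangle with an alternating-sign boundary convention).

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals (= number of nonzero diagonal entries of the SNF)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def betti(n_k_simplices, M_k, M_k_plus_1):
    """Betti number = rank(Z_k) - rank(B_k); pass [] for a boundary map that is zero."""
    rank_Zk = n_k_simplices - rank(M_k)  # zero columns of SNF(M_k)
    rank_Bk = rank(M_k_plus_1)           # nonzero rows of SNF(M_{k+1})
    return rank_Zk - rank_Bk


# Triangle with vertices 0, 1, 2 and edges (0,1), (0,2), (1,2).
M1 = [[-1, -1, 0], [1, 0, -1], [0, 1, 1]]  # edges -> vertices
M2 = [[1], [-1], [1]]                       # filled triangle -> edges

hollow = [betti(3, [], M1), betti(3, M1, [])]
filled = [betti(3, [], M1), betti(3, M1, M2), betti(1, M2, [])]
print(hollow)  # [1, 1]
print(filled)  # [1, 0, 0]
```

The hollow triangle has one connected component and one loop; filling it in removes the loop and creates no void.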

18 of 36

Example

Longer example to walk through the steps for calculating the homology groups of some simple simplicial complex using linear algebra

19 of 36

Break between Monday and Thursday

20 of 36

Review

21 of 36

Triangulation

Break up/transform a surface/shape into triangles/simplices

This way we can use our computational tools

22 of 36

Example

Short examples on the whiteboard (depending on time)

Sphere

Donut

Cylinder

23 of 36

In real life…

  • No simplicial complexes!

24 of 36

Vietoris-Rips Complex

  1. Pick some value ε
  2. Around each point, draw a ball with radius = ε
  3. When some set of points {x1, x2, …, xk} all have pairwise distance between them less than ε (each point lies inside the others' balls), then draw a (k-1)-simplex to connect them
  4. This will give you a simplicial complex
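
The construction above can be sketched as a brute-force Python function. This is an illustration only: the name `vietoris_rips` and the `max_dim` cutoff are assumptions, points are assumed to be coordinate tuples with ordinary Euclidean distance, and checking every subset is only practical for small point clouds.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, epsilon, max_dim=2):
    """List the simplices (as tuples of point indices) up to dimension max_dim."""
    n = len(points)
    # Pairs of points whose distance is less than epsilon.
    close = {(i, j) for i, j in combinations(range(n), 2)
             if dist(points[i], points[j]) < epsilon}
    simplices = [(i,) for i in range(n)]
    for size in range(2, max_dim + 2):
        for subset in combinations(range(n), size):
            # A subset spans a simplex when all of its pairs are close.
            if all(pair in close for pair in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


points = [(0, 0), (1, 0), (0, 1), (5, 5)]
print(vietoris_rips(points, 0.5))       # only the four vertices
print(len(vietoris_rips(points, 1.2)))  # 6: four vertices, two edges
print(len(vietoris_rips(points, 1.5)))  # 8: four vertices, three edges, one triangle
```

Even on four points, the complex changes as ε grows, which is the behavior the next slides examine.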

Next: we will think about what happens over a range of ε values

25 of 36

Example

Short whiteboard example:

  • Very small epsilon
  • Very large epsilon
  • Something in the middle

Online simulator

https://hosscine.shinyapps.io/rips_complex/

https://www.smajhi.com/tutorials/topology/rips.html

26 of 36

Persistent Homology

  • Basic idea: each time we choose a new epsilon
    • recalculate Betti numbers
    • observe the creation/destruction of topological features
  • Persistence diagrams: show the “birth” and “death” of topological features
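
For dimension 0 (connected components), the birth and death of features can be tracked without recomputing everything at each ε. The sketch below is one common shortcut, not necessarily the method used in the talk: sort the pairwise distances and merge components with a union-find structure. The name `h0_persistence` is hypothetical, and points are assumed to be coordinate tuples with Euclidean distance.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """Birth-death pairs of connected components as epsilon grows."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies.
            parent[root_i] = root_j
            pairs.append((0.0, length))
    if points:
        # The last remaining component never dies.
        pairs.append((0.0, float("inf")))
    return pairs


print(h0_persistence([(0, 0), (1, 0), (5, 0)]))
# [(0.0, 1.0), (0.0, 4.0), (0.0, inf)]
```

Every component is born at ε = 0; the pair that dies at 4.0 persists much longer than the one that dies at 1.0, which suggests two well-separated clusters.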

Final step: need to interpret “birth” and “death” of holes in a way that provides information about the dataset

27 of 36

Persistence Diagram Board Example?

Maybe

Must be very simple so I can draw it

28 of 36

Persistence Diagrams Example

29 of 36

Working with real data

  • Data points become a point cloud
    • Their locations could be based on actual geographic/physical location
    • Or we could plot points according to their features

  • Need a measure of distance
    • Might just use ordinary distance metric
    • Could also use something fancier

30 of 36

Remote sensing technology - Light Detection and Ranging (LiDAR)

  • Problem: localization accuracy - “loop closure?”

  • Very small distance between the descriptive vectors of two scans: -> a place we have been to / a loop!
  • Large distance: -> new place / no loop!

-> What is inside these descriptive vectors?

31 of 36

Betti numbers? Birth & Death?

  • Birth–death pairs tell how stable those structures are: how long they persist across scales.
  • Betti numbers: what the structure is; birth-death pairs: how solid it is / how confident we are

Betti number | Symbol | Meaning in geometry | Meaning in LiDAR scenes
B₀ / H₀ | Connected components | “How many isolated clusters exist?” | separate buildings, cars, trees
B₁ / H₁ | Loops / holes | “How many circular tunnels or rings?” | roads surrounding squares, bridges, circular layouts
B₂ / H₂ | Voids | “How many enclosed 3D spaces?” | tunnels, underpasses, courtyards, domes

32 of 36

Polling site

https://hosscine.shinyapps.io/rips_complex/ - animation link (again)

  • Researchers use persistent homology (PH) to study voting site accessibility across U.S. cities.

33 of 36

Cancer Imaging

Can TDA improve the prediction of lung-tumor histology (the tissue type of the tumor) from CT images?

Data:

  • 3D CT tumor volumes (segmented by radiologists).
  • Each tumor treated as a 3D voxel cloud.
  • Tasks: benign vs malignant | adenocarcinoma vs squamous | small vs non-small cell.
  • H0, H1, H2?

34 of 36

35 of 36

How TDA helps

What TDA detects:

  • Birth–death pairs reflect how long structural features persist as intensity thresholds change.
  • Long-lived features = strong, real 3D structures inside the tumor.

Results:

  • Topological features add complementary information to radiomics.
  • Improved accuracy for benign vs malignant and adeno vs squamous classification.
  • Complex, irregular topology → more likely malignant.

36 of 36

Articles