1 of 28

Deep Representation Learning for Unsupervised Learning on Graphs

Oct 28th, 2025

BMI/CS 775 Computational Network Biology, Fall 2025

Sushmita Roy

https://bmi775.sites.wisc.edu

2 of 28

Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder

3 of 28

Overview of algorithms for unsupervised representation learning

  • Shallow, structure only
    • Matrix factorization
    • Random walk based methods: Node2vec, DeepWalk
  • Deep, structure and node features
    • Graph Variational AutoEncoder
    • Deep Graph Infomax

4 of 28

Limitations of shallow methods

  • Unable to use node features/attributes (easily)

  • Are inherently transductive
    • All nodes must be seen during the training phase/embedding phase
  • Therefore, they cannot be used in an inductive setting
    • There is no general function that can make a prediction for a new, unseen test node

5 of 28

Unsupervised representation learning on graphs

  • Learn an encoder that maps each node to a low-dimensional embedding, and a decoder that reconstructs a target from the embeddings

The task defines what we want to reconstruct back (e.g., graph structure or node similarity)

6 of 28

Embedding graphs with node attributes

[Figure: a seven-node graph in which each node carries a d-dimensional attribute vector (x_1, …, x_d). An embedding is learned for every node, e.g., node 1 maps to (x_11, …, x_1d) and node 5 to (x_51, …, x_5d), using both the structure and the node attributes.]

7 of 28

Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder

8 of 28

Autoencoder

  • A dimensionality reduction framework that can be used for various downstream tasks, e.g., visualization, clustering, and denoising
  • Comprises an
    • “encoder” that takes the input and projects it into a (typically) low-dimensional space
    • “decoder” that takes the low-dimensional projection and maps it back into the original high-dimensional space


11 of 28

Autoencoder

[Figure: autoencoder schematic. An input x is passed through the encoder to a low-dimensional embedding, which the decoder maps back to a reconstructed output x̂.]

Adapted from Ava Amini slides; see https://www.youtube.com/watch?v=3G5hWM6jqPk

  • To train an autoencoder, we minimize a reconstruction loss between the input and its reconstruction
  • Mean squared error is commonly used as the loss function: $\mathcal{L}(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$, where $\hat{x} = \mathrm{DEC}(\mathrm{ENC}(x))$
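As a concrete illustration, here is a minimal PyTorch sketch of this setup; the layer sizes, input dimension, and toy data are illustrative assumptions, not the lecture's model:

```python
import torch
import torch.nn as nn

# Minimal autoencoder sketch: the encoder compresses a d-dimensional input
# to a k-dimensional embedding; the decoder maps it back. Trained by
# minimizing mean squared reconstruction error.
class Autoencoder(nn.Module):
    def __init__(self, d_in=784, d_hidden=128, d_embed=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_embed),
        )
        self.decoder = nn.Sequential(
            nn.Linear(d_embed, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_in),
        )

    def forward(self, x):
        z = self.encoder(x)      # low-dimensional embedding
        return self.decoder(z)   # reconstruction x_hat

model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 784)         # toy batch standing in for real data
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss
loss.backward()
optimizer.step()
```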

 

12 of 28

Autoencoder vs Variational Autoencoder

  • An autoencoder maps each input to a single point in the latent space; a variational autoencoder instead maps each input to a distribution over the latent space (a mean and a variance per dimension) and samples the embedding from that distribution

https://towardsdatascience.com/tutorial-on-variational-graph-auto-encoders-da9333281129

13 of 28

Variational Autoencoder

[Figure: VAE schematic. Input x → encoder → latent distribution parameters (μ, σ) → sampled z → decoder → reconstructed output x̂.]

  • The encoder outputs the parameters of a latent distribution, $q_\phi(z \mid x) = \mathcal{N}\big(\mu_\phi(x), \operatorname{diag}(\sigma^2_\phi(x))\big)$
  • A latent code is sampled via the reparameterization trick: $z = \mu + \sigma \odot \epsilon$, with $\epsilon \sim \mathcal{N}(0, I)$
  • The decoder $p_\theta(x \mid z)$ reconstructs the input from the sampled $z$
  • Training maximizes the evidence lower bound: $\mathcal{L} = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)$

Adapted from Ava Amini slides; see https://www.youtube.com/watch?v=3G5hWM6jqPk
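A minimal sketch of the reparameterization step and the closed-form KL term, assuming a diagonal Gaussian posterior (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

# Sketch of a Gaussian VAE encoder head with the reparameterization trick:
# the network outputs mu and log-variance, and z = mu + sigma * eps with
# eps ~ N(0, I), so sampling stays differentiable.
class VAEEncoder(nn.Module):
    def __init__(self, d_in=784, d_hidden=128, d_embed=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.mu_head = nn.Linear(d_hidden, d_embed)      # mean of q(z|x)
        self.logvar_head = nn.Linear(d_hidden, d_embed)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        eps = torch.randn_like(mu)
        z = mu + torch.exp(0.5 * logvar) * eps           # reparameterization
        return z, mu, logvar

# Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
def kl_divergence(mu, logvar):
    return -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    )
```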

 

14 of 28

Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder

15 of 28

Notation

  • $G = (V, E)$: a graph with $N$ nodes
  • $A$: the $N \times N$ adjacency matrix of $G$
  • $X$: the $N \times d$ matrix of node features/attributes
  • $Z$: the $N \times f$ matrix of latent node embeddings $z_i$

16 of 28

Graph AutoEncoder (GAE)

Encoder: $Z = \mathrm{GCN}(X, A)$, a graph convolutional network that maps the node features $X$ and the structure $A$ to embeddings $Z$

Decoder: $\hat{A} = \sigma(Z Z^\top)$, where $\sigma(\cdot)$ is the logistic (sigmoid) function applied to the inner product of each pair of embeddings

[Figure: the GAE pipeline on the seven-node attributed graph from earlier. The encoder maps node attributes and structure to per-node embeddings; the decoder reconstructs the graph's adjacency matrix.]
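To make the encoder/decoder pair concrete, here is a hedged sketch using a dense one-layer GCN encoder (the published GAE stacks two GCN layers; the names, sizes, and toy graph here are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of a Graph AutoEncoder with dense matrices: a one-layer GCN-style
# encoder produces node embeddings Z, and the decoder reconstructs the
# adjacency matrix as sigmoid(Z Z^T).
class GAE(nn.Module):
    def __init__(self, d_in, d_embed):
        super().__init__()
        self.W = nn.Linear(d_in, d_embed, bias=False)

    def encode(self, X, A_norm):
        return A_norm @ self.W(X)          # one propagation step: A_norm X W

    def decode(self, Z):
        return torch.sigmoid(Z @ Z.T)      # edge probability sigmoid(z_i . z_j)

def normalize_adj(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}
    A_tilde = A + torch.eye(A.shape[0])
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

# Toy usage: 7 nodes with 5-dimensional attributes
A = (torch.rand(7, 7) > 0.6).float()
A = ((A + A.T) > 0).float()                # make the toy graph undirected
X = torch.randn(7, 5)
model = GAE(d_in=5, d_embed=3)
A_hat = model.decode(model.encode(X, normalize_adj(A)))
loss = F.binary_cross_entropy(A_hat, A)    # reconstruction loss on all pairs
```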

 

 

 

17 of 28

Variational Graph Autoencoder (VGAE)

Encoder (inference model): $q(Z \mid X, A) = \prod_{i=1}^{N} q(z_i \mid X, A)$, with $q(z_i \mid X, A) = \mathcal{N}\big(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\big)$

Decoder (generative model): $p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid z_i, z_j)$, with $p(A_{ij} = 1 \mid z_i, z_j) = \sigma(z_i^\top z_j)$

[Figure: the VGAE pipeline on the same attributed graph. The encoder produces a latent Gaussian per node, embeddings are sampled from these distributions, and the decoder reconstructs the adjacency matrix.]

 

18 of 28

Encoder for the VGAE

  • The parameters of the posterior are produced by graph convolutional networks:
    • $\mu = \mathrm{GCN}_\mu(X, A)$ and $\log \sigma = \mathrm{GCN}_\sigma(X, A)$
    • $\mathrm{GCN}(X, A) = \tilde{A}\, \mathrm{ReLU}(\tilde{A} X W_0)\, W_1$, a two-layer GCN
    • $\tilde{A} = D^{-1/2} A D^{-1/2}$ is the symmetrically normalized adjacency matrix
    • $\mathrm{GCN}_\mu$ and $\mathrm{GCN}_\sigma$ share the first-layer weights $W_0$
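A sketch of this two-layer encoder with dense matrices (names and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Sketch of the VGAE encoder: a shared first GCN layer W0, then separate
# second layers producing the mean and log standard deviation of each
# node's posterior q(z_i | X, A). A_norm is the normalized adjacency.
class VGAEEncoder(nn.Module):
    def __init__(self, d_in, d_hidden, d_embed):
        super().__init__()
        self.W0 = nn.Linear(d_in, d_hidden, bias=False)           # shared layer
        self.W_mu = nn.Linear(d_hidden, d_embed, bias=False)      # GCN_mu head
        self.W_logstd = nn.Linear(d_hidden, d_embed, bias=False)  # GCN_sigma head

    def forward(self, X, A_norm):
        H = torch.relu(A_norm @ self.W0(X))                   # first layer
        mu = A_norm @ self.W_mu(H)
        logstd = A_norm @ self.W_logstd(H)
        z = mu + torch.exp(logstd) * torch.randn_like(mu)     # reparameterize
        return z, mu, logstd
```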

19 of 28

Decoder for the VGAE

  • The decoder is the same inner product decoder as in the GAE: the probability of an edge between nodes $i$ and $j$ is $p(A_{ij} = 1 \mid z_i, z_j) = \sigma(z_i^\top z_j)$, i.e., $\hat{A} = \sigma(Z Z^\top)$, with $Z$ now sampled from the encoder's posterior

20 of 28

Loss function

  • Recall the general VAE loss:

    $\mathcal{L} = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)$

  • Loss of VGAE:

    $\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\left[\log p(A \mid Z)\right] - \mathrm{KL}\big(q(Z \mid X, A) \,\|\, p(Z)\big)$

    where the prior is a standard Gaussian, $p(Z) = \prod_i \mathcal{N}(z_i \mid 0, I)$
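A hedged sketch of computing this objective with the embeddings and posterior parameters from the encoder above (the paper's edge re-weighting and KL normalization details are omitted):

```python
import torch
import torch.nn.functional as F

# Sketch of the VGAE objective: binary cross-entropy between the
# reconstructed and observed adjacency (the expected log-likelihood term)
# plus the closed-form KL to the standard normal prior.
def vgae_loss(A, z, mu, logstd):
    A_hat = torch.sigmoid(z @ z.T)              # inner product decoder
    recon = F.binary_cross_entropy(A_hat, A)    # approximates -E[log p(A|Z)]
    kl = -0.5 * torch.mean(
        torch.sum(1 + 2 * logstd - mu.pow(2) - torch.exp(2 * logstd), dim=1)
    )
    return recon + kl
```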

21 of 28

Evaluation of GAE and VGAE

  • Experiments done for link prediction for several datasets
  • Models trained on incomplete data where some edges were hidden/removed
  • Assessed via AUC and Average Precision (AP)
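For reference, a sketch of how AUC and AP could be computed on held-out edges with scikit-learn; the embeddings, edge lists, and scoring function here are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Hypothetical evaluation: score held-out positive edges and an equal
# number of sampled non-edges with the inner product decoder, then
# compute AUC and average precision.
def edge_scores(Z, edges):
    return [1.0 / (1.0 + np.exp(-Z[i] @ Z[j])) for i, j in edges]

Z = np.random.randn(7, 3)                  # stand-in for learned embeddings
pos_edges = [(0, 1), (1, 2), (3, 4)]       # held-out true edges
neg_edges = [(0, 5), (2, 6), (4, 6)]       # sampled non-edges
y_true = [1] * len(pos_edges) + [0] * len(neg_edges)
y_score = edge_scores(Z, pos_edges) + edge_scores(Z, neg_edges)
print("AUC:", roc_auc_score(y_true, y_score))
print("AP:", average_precision_score(y_true, y_score))
```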

22 of 28

Evaluation of GAE and VGAE

[Table: link-prediction AUC and AP for each method across datasets (table not recovered).]

SC: Spectral clustering

DW: DeepWalk

GAE*, VGAE*: variants trained without node features

23 of 28

VGAE’s embedding can reveal clustering structure

[Figure] Latent space of an unsupervised VGAE model trained on the Cora citation network dataset. Grey lines denote citation links; colors denote document class (not provided during training).

24 of 28

Overview of GNN methods for clustering

Su, X. et al. A Comprehensive Survey on Community Detection with Deep Learning. IEEE Trans. Neural Netw. Learning Syst. 1–21 (2022) doi:10.1109/TNNLS.2021.3137396.

25 of 28

Overview of deep methods for graph clustering

  • Methods can be grouped into two broad categories
  • First generation of methods
    • Use graph structure only to learn node embeddings
      • DNR, DGNR, GraphEncoder
  • Second generation of methods
    • Use both graph structure and node attributes to learn node embeddings
      • Independent of cluster information: MGAE, SDGN
      • Incorporate cluster structure: Senet, SGCN

26 of 28

The general idea of methods for graphs with node attributes

[Figure: the attributed seven-node graph is embedded into per-node vectors from structure and node attributes; k-means clustering of the embeddings then assigns each node to a module.]

Note 1: Most methods aim to do this in an unsupervised manner because we assume we don’t have labels.

Note 2: The node embedding could be further influenced by the “clustering” structure (see the sketch below).
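A sketch of the post-hoc clustering step on learned embeddings (the embeddings here are synthetic stand-ins):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical post-processing: cluster learned node embeddings with
# k-means so each node receives a module assignment.
Z = np.random.randn(7, 3)                  # stand-in for learned embeddings
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
modules = kmeans.fit_predict(Z)            # one cluster label per node
print(modules)
```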

27 of 28

Does deep node embedding help for biological network module detection?

Song Z, Baur B and Roy S. Benchmarking graph representation learning algorithms for detecting modules in molecular networks [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:941 (https://doi.org/10.12688/f1000research.134526.1)

28 of 28

References