1 of 27

Deep Representation Learning for Unsupervised Learning on Graphs

Nov 2nd, 2023

BMI/CS 775 Computational Network Biology, Fall 2023

Sushmita Roy

https://compnetbiocourse.discovery.wisc.edu


Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder


Overview of algorithms for unsupervised representation learning

  • Shallow, structure only
    • Matrix factorization
    • Random walk based methods: Node2vec, DeepWalk
  • Deep, structure and node features
    • Graph Variational AutoEncoder
    • Deep Graph Infomax


Limitations of shallow methods

  • Unable to use node features/attributes (easily)

  • Are inherently transductive
    • All nodes must be seen during the training phase/embedding phase
  • Therefore, cannot be used in an inductive setting
    • No general function that could make a prediction on a new test node


Unsupervised representation learning on graphs

  • Goal: learn a low-dimensional embedding for each node, without labels, by training the embedding to reconstruct some aspect of the graph
  • The reconstruction target (what we want to reconstruct back) defines the specific task


Embedding graphs with node attributes

[Figure: a graph of 7 nodes, each annotated with a d-dimensional attribute vector (x1 … xd); from the attributed graph, an embedding vector is learned for each node.]

Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder


Autoencoder

  • A dimensionality-reduction framework that can be used for various downstream tasks, e.g. visualization, clustering, and denoising
  • Comprises:
    • an “encoder” that takes the input and projects it into a (typically) low-dimensional space
    • a “decoder” that takes the low-dimensional projection and maps it back into the original high-dimensional space

Autoencoder

[Figure: the encoder maps the input to a low-dimensional embedding; the decoder maps the embedding back to a reconstruction of the input. Adapted from Ava Amini slides; see https://www.youtube.com/watch?v=3G5hWM6jqPk]
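A minimal runnable sketch of this encode/decode idea, using a purely linear encoder and decoder trained by gradient descent on the mean-squared reconstruction error. This is only an illustration: the data, dimensions, learning rate, and variable names are all made up, and real autoencoders use nonlinear, multi-layer networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples with 10 features; the last 5 features duplicate
# the first 5, so the data is compressible into a smaller latent space.
X = rng.normal(size=(200, 10))
X[:, 5:] = X[:, :5]

d_in, d_hid = 10, 3
W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))  # decoder weights

def reconstruct(X, W_enc, W_dec):
    Z = X @ W_enc        # encoder: project into the 3-d latent space
    return Z @ W_dec     # decoder: map back to the 10-d input space

def mse(X, X_hat):
    return np.mean((X - X_hat) ** 2)

lr = 0.2
loss_start = mse(X, reconstruct(X, W_enc, W_dec))
for _ in range(2000):
    Z = X @ W_enc
    X_hat = Z @ W_dec
    G = 2.0 * (X_hat - X) / X.size     # gradient of MSE w.r.t. X_hat
    g_dec = Z.T @ G                    # chain rule through the decoder
    g_enc = X.T @ (G @ W_dec.T)        # chain rule through the encoder
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
loss_end = mse(X, reconstruct(X, W_enc, W_dec))
```

After training, the reconstruction loss should be well below its starting value, since a 3-dimensional latent space can capture most of this (rank-5) data.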

  • To train an autoencoder, we minimize a reconstruction loss
  • Mean squared error is commonly used as the loss function:

L(x, x̂) = ‖x − x̂‖²


Autoencoder vs Variational Autoencoder

  • An autoencoder maps each input to a single point in the latent space; a variational autoencoder instead maps each input to a distribution over the latent space (a mean and a variance), from which the latent embedding is sampled

https://towardsdatascience.com/tutorial-on-variational-graph-auto-encoders-da9333281129


Variational Autoencoder

[Figure: the VAE encoder outputs a mean μ and a standard deviation σ for each latent dimension; a latent vector z is sampled from N(μ, σ²) and passed to the decoder, which reconstructs the input. The loss combines a reconstruction term with a KL-divergence term that pulls the latent distribution toward a standard normal prior. Adapted from Ava Amini slides; see https://www.youtube.com/watch?v=3G5hWM6jqPk]
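The sampling step can stay trainable via the reparameterization trick. A small sketch of that step and of the KL regularizer (the function names, and the assumption that the encoder outputs a mean and a log-variance per latent dimension, are illustrative):

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    # z = mu + sigma * eps moves the randomness into eps, so mu and
    # sigma can still receive gradients during training
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ), the regularizer in the VAE loss;
    # it is exactly 0 when the posterior equals the standard normal prior
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu, log_var = np.zeros(3), np.zeros(3)   # posterior = N(0, I)
z = sample_latent(mu, log_var, rng)      # one sample from N(0, I)
kl = kl_to_standard_normal(mu, log_var)  # 0 for this choice of mu, sigma
```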

 


Goals for today

  • Motivation for deep representation learning
  • Deep graph representation learning
    • Autoencoder
    • Graph AutoEncoder and Graph Variational AutoEncoder


Notation

  • G = (V, E): input graph with n nodes
  • A: n × n adjacency matrix of G
  • X: n × F matrix of node features
  • Z: n × f matrix of latent node embeddings (one f-dimensional embedding per node)


Graph AutoEncoder (GAE)

 

 

Encoder: Z = GCN(X, A), a graph convolutional network applied to the feature matrix X and the adjacency matrix A

Decoder: Â = σ(Z Zᵀ), where σ is the logistic sigmoid function

[Figure: an attributed graph of 7 nodes (features x1 … xd per node) is encoded into node embeddings Z; the inner-product decoder reconstructs the adjacency matrix.]

A: adjacency matrix

X: n × F feature matrix
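A runnable sketch of the GAE forward pass in plain numpy. This is illustrative only: it uses a single GCN layer with ReLU (the published model uses two layers), random weights instead of trained ones, and a made-up toy graph.

```python
import numpy as np

def normalize_adj(A):
    # symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gae_forward(A, X, W):
    A_hat = normalize_adj(A)
    Z = np.maximum(A_hat @ X @ W, 0)      # encoder: one GCN layer + ReLU
    logits = Z @ Z.T                      # inner-product decoder
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> edge probabilities

# toy graph: 4 nodes in a path 0-1-2-3, one-hot node features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 2))    # untrained weights, 2-d embedding
P = gae_forward(A, X, W)                  # 4x4 matrix of edge probabilities
```

Because the decoder is an inner product of the embeddings, the reconstructed matrix P is symmetric with entries strictly between 0 and 1.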


Variational Graph Autoencoder (VGAE)

 

 

 

 

 

 

Encoder: q(Z | X, A) = ∏ᵢ q(zᵢ | X, A), with q(zᵢ | X, A) = N(zᵢ | μᵢ, diag(σᵢ²))

Decoder: p(A | Z) = ∏ᵢ ∏ⱼ p(Aᵢⱼ | zᵢ, zⱼ), with p(Aᵢⱼ = 1 | zᵢ, zⱼ) = σ(zᵢᵀ zⱼ)

[Figure: an attributed graph of 7 nodes is encoded into a distribution over node embeddings; embeddings sampled from this distribution are decoded to reconstruct the adjacency matrix.]

 


Encoder for the VGAE

  • The parameters of the variational distribution are produced by a two-layer GCN:

μ = GCN_μ(X, A),  log σ = GCN_σ(X, A)

GCN(X, A) = Â ReLU(Â X W₀) W₁

where Â = D^(−1/2) A D^(−1/2) is the symmetrically normalized adjacency matrix

  • GCN_μ(X, A) and GCN_σ(X, A) share the first-layer weights W₀


Decoder for the VGAE

  • The decoder is an inner product between latent embeddings, passed through the logistic sigmoid σ:

p(A | Z) = ∏ᵢ ∏ⱼ p(Aᵢⱼ | zᵢ, zⱼ),  with p(Aᵢⱼ = 1 | zᵢ, zⱼ) = σ(zᵢᵀ zⱼ)

  • Training maximizes the variational lower bound:

L = E_{q(Z|X,A)} [log p(A | Z)] − KL( q(Z | X, A) ‖ p(Z) ),  with prior p(Z) = ∏ᵢ N(zᵢ | 0, I)
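A sketch of computing the VGAE objective (the negative of the variational lower bound) for a given adjacency matrix. Illustrative only: here mu and log_var stand in for the outputs of the two GCN encoders, the whole graph is treated as one dense batch, and all names are made up.

```python
import numpy as np

def vgae_loss(A, mu, log_var, rng):
    # reparameterized sample of the node embeddings Z
    eps = rng.standard_normal(mu.shape)
    Z = mu + np.exp(0.5 * log_var) * eps
    # inner-product decoder: edge probabilities sigma(z_i . z_j)
    P = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))
    tiny = 1e-9  # numerical guard inside the logs
    # reconstruction term: binary cross-entropy over all entries of A
    recon = -np.sum(A * np.log(P + tiny) + (1 - A) * np.log(1 - P + tiny))
    # KL( q(Z|X,A) || N(0, I) ) regularizer
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return recon + kl  # minimizing this maximizes the lower bound

rng = np.random.default_rng(0)
A = np.array([[0, 1], [1, 0]], dtype=float)  # toy 2-node graph
loss = vgae_loss(A, mu=np.zeros((2, 3)), log_var=np.zeros((2, 3)), rng=rng)
```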


Evaluation of GAE and VGAE

  • Experiments performed on link prediction across several datasets
  • Models trained on incomplete graphs in which some edges were hidden/removed
  • Assessed via area under the ROC curve (AUC) and average precision (AP)


Evaluation of GAE and VGAE

SC: Spectral clustering

DW: DeepWalk

GAE*, VGAE*: Variants with no features


VGAE’s embedding can reveal clustering structure

Latent space of unsupervised VGAE model trained on Cora citation network dataset. Grey lines denote citation links. Colors denote document class (not provided during training).


Overview of GNN methods for clustering

Su, X. et al. A Comprehensive Survey on Community Detection with Deep Learning. IEEE Trans. Neural Netw. Learning Syst. 1–21 (2022) doi:10.1109/TNNLS.2021.3137396.


Overview of deep methods for graph clustering

  • Methods can be grouped into two broad categories
  • First-generation methods
    • Use graph structure only to learn node embeddings
      • DNR, DGNR, GraphEncoder
  • Second-generation methods
    • Use graph structure and node attributes to learn node embeddings
      • Independent of cluster information: MGAE, SDGN
      • Incorporating cluster structure: Senet, SGCN


The general idea of methods for graphs with node attributes

[Figure: a graph of 7 nodes with d-dimensional node attributes (x1 … xd); an embedding is learned for each node, and k-means clustering of the embeddings yields node clusters.]

Note 1: Most methods aim to do this in an unsupervised manner because we assume we don’t have labels.

Note 2: The node embedding could be further influenced by the “clustering” structure
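The generic pipeline above, embed then cluster, can be sketched with a plain-numpy k-means. Illustrative only: the "embeddings" here are simulated as two well-separated groups rather than learned from a graph, and the initialization is chosen by hand.

```python
import numpy as np

def kmeans(Z, init_idx, n_iter=50):
    # Lloyd's algorithm: alternate point assignment and center updates
    centers = Z[init_idx].astype(float).copy()
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        # assign each point to its nearest center
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned points
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(0)
# simulated 2-d node embeddings: two tight "modules" far apart
Z = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),   # module 1
               rng.normal(3.0, 0.1, size=(20, 2))])  # module 2
labels = kmeans(Z, init_idx=[0, 39])  # seed one center in each module
```

With well-separated embeddings, k-means recovers the two modules exactly; in practice the embeddings would come from one of the deep methods above.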


Does deep node embedding help for biological network module detection?

Song Z, Baur B and Roy S. Benchmarking graph representation learning algorithms for detecting modules in molecular networks [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:941 (https://doi.org/10.12688/f1000research.134526.1)


References