1 of 47

Dynamic graph clustering

Nov 11th 2021

BMI 826-23 Computational Network Biology, Fall 2021

Sushmita Roy

https://compnetbiocourse.discovery.wisc.edu

2 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

3 of 47

RECAP: Network dynamics

  • What does modeling “dynamics” mean?
    • The activity of nodes changes over time and we want to model how this happens
    • The network (structure or parameters) changes with time
      • Structure can change due to changes at the node or edge level
  • What models can be used for capturing dynamics in networks?
  • How do these models capture dynamics?
    • Node level
    • Edge level

4 of 47

Dynamic modules

Sun et al., Entropy 2020

5 of 47

How might network modules change over time?

Adding nodes

Deleting nodes

Adding edges

Deleting edges

Sun et al., Entropy 2020

6 of 47

Notation

  • T: the number of time points
  • G1, …, GT: the graphs across time
  • A1, …, AT: the corresponding adjacency matrices
  • C1, …, CT: the communities at each time point

7 of 47

Problem definition

  • Given
    • G1, …, GT representing graphs across time, where Gt denotes the graph at the tth time point
  • Do
    • Identify the communities C1, …, CT for each time point
    • (optionally) Identify how communities change over time

8 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

9 of 47

Different approaches to dynamic module identification

  • Modularity based
    • DynaMo, Zhuang et al., 2019
    • See review https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516902
  • Spectral clustering and smoothing
    • PisCES (Liu et al., 2018)
    • Spectral fusion (Huang et al., 2019)
  • Dynamic Stochastic block models
    • DSBM
    • HSBM
    • Extended Kalman Filtering
  • Dynamic topic models

10 of 47

Properties of different approaches

  • Probabilistic or non-probabilistic
  • Does the number of modules stay fixed, or does it change over time?
  • Do we model module transitions/evolution across time?
  • Do nodes stay the same or change over time?

11 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

12 of 47

DynaMo: Dynamic Community Detection by Incrementally Maximizing Modularity

  • An incremental algorithm that maximizes the gain in community-structure modularity based on the incremental changes of a dynamic network
  • Assumes the network evolves through a series of incremental edge manipulations
    • Edge addition/removal within and between communities
    • Addition/removal of nodes

13 of 47

DynaMo overview

  • DynaMo is based on applying the Louvain algorithm at each time point
  • Initialize the modules at t = 0
  • Given the modules at time t, update the modules at t+1 by applying the Louvain algorithm to increase modularity
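DynaMo's objective is modularity; the following is a minimal pure-Python sketch of the modularity score that Louvain-style moves greedily increase (this illustrates the quantity being optimized, not DynaMo's incremental update itself):

```python
def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected graph.

    adj: dict node -> set of neighbors (undirected, no self-loops)
    communities: dict node -> community label
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    q = 0.0
    for u in adj:
        for v in adj:
            if communities[u] != communities[v]:
                continue
            a_uv = 1.0 if v in adj[u] else 0.0
            # modularity compares observed edges within communities
            # to the expectation under a degree-preserving null model
            q += a_uv - degree[u] * degree[v] / (2 * m)
    return q / (2 * m)

# Two triangles joined by one bridge edge (2, 3)
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
good = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {u: "a" for u in adj}
print(modularity(adj, good))       # ≈ 0.357 (= 5/14)
print(modularity(adj, one_block))  # ≈ 0.0
```

DynaMo's speedup comes from warm-starting: rather than re-running Louvain from scratch at t+1, it reuses the communities from time t and revisits only the parts of the graph touched by the edge/node changes.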

14 of 47

DynaMo overview


15 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

16 of 47

PisCES: Global spectral clustering in dynamic networks

  • Based on a spectral clustering and eigenvector smoothing framework
  • The number of clusters per time point can vary

17 of 47

PisCES notation

  • n: number of nodes in the graph
  • Vt: the n × K matrix of eigenvectors of the graph Laplacian at time t
  • Ut = Vt Vt′: a reconstructed node-similarity matrix (the projection onto the top K eigenvectors)

18 of 47

PisCES algorithm overview

  • Aims to smooth the projected node similarity across time
  • Identify eigenvectors for each smoothed graph
  • Apply k-means to the resulting eigenvectors
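The static building block of this pipeline can be sketched in a few lines of numpy (a sketch only, not PisCES itself: PisCES additionally smooths the eigenvector projections across adjacent time points before clustering, and the farthest-point k-means initialization below is an illustrative choice):

```python
import numpy as np

def spectral_clusters(A, k, n_iter=50):
    """Sketch of static spectral clustering: embed nodes with the top-k
    eigenvectors of the symmetrically normalized adjacency matrix, then
    run a tiny Lloyd-style k-means on the embedding rows."""
    deg = A.sum(axis=1)
    d = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = d[:, None] * A * d[None, :]          # D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    V = vecs[:, -k:]                         # top-k eigenvectors as columns

    # Deterministic farthest-point initialization of the k centers
    centers = [V[0]]
    for _ in range(1, k):
        dists = np.min([((V - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(V[dists.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):                  # plain Lloyd iterations
        d2 = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = V[labels == j].mean(axis=0)
    return labels

# Two 4-node cliques joined by a single bridge edge (3, 4)
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0
labels = spectral_clusters(A, 2)
print(labels)  # the two cliques land in different clusters
```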

19 of 47

PisCES algorithm

  • PisCES objective:

  • Can be reformulated as:
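A sketch of the PisCES objective, following Liu et al. (2018) and writing Ut = Vt Vt′ for the rank-K projection at time t (the exact constraints and constants are as in the paper):

```latex
% Smoothed projections \bar{U}_t are fit to the per-time projections U_t
% while being encouraged to change slowly between adjacent time points
% (each \bar{U}_t constrained to be a rank-K projection):
\min_{\bar{U}_1, \dots, \bar{U}_T} \;
  \sum_{t=1}^{T} \left\lVert \bar{U}_t - U_t \right\rVert_F^2
  \; + \; \alpha \sum_{t=2}^{T} \left\lVert \bar{U}_t - \bar{U}_{t-1} \right\rVert_F^2
```

In the reformulated, iterative version, each smoothed Vt is updated as the top-K eigenvectors of Ut plus α times the neighboring smoothed projections, which is what makes the procedure a spectral method with temporal smoothing.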

20 of 47

Identification of temporal modules in macaque brain co-expression networks

21 of 47

Zooming into a pair of time points

22 of 47

A gene subnetwork exhibiting substantial rewiring

23 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

24 of 47

Stochastic block model (SBM)

  • Each node belongs to one of K groups (blocks); an edge between two nodes is present with a probability θij that depends only on the groups i and j of its endpoints

Park and Bader BMC Bioinformatics 2011

25 of 47

Probability of a grouping structure from the stochastic block model

  • eij: number of edges between groups i and j
  • tij: number of possible edges between groups i and j
  • hij: number of holes (absent edges) between groups i and j, i.e., hij = tij − eij

Probability of edge and hole patterns between groups i and j

Probability of edge-hole pattern (M) for all groups

Park and Bader BMC Bioinformatics 2011
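One way to write the probabilities named on this slide, assuming (as I believe matches Park and Bader's Bayesian treatment) a uniform prior on each θij that is integrated out:

```latex
% Probability of the edge/hole pattern between groups i and j,
% integrating a Bernoulli likelihood over a uniform prior on \theta_{ij}:
P(e_{ij}, h_{ij})
  = \int_0^1 \theta^{e_{ij}} (1-\theta)^{h_{ij}} \, d\theta
  = \frac{e_{ij}! \, h_{ij}!}{(t_{ij}+1)!}

% Probability of the full edge-hole pattern M over all group pairs:
P(M) = \prod_{i \le j} \frac{e_{ij}! \, h_{ij}!}{(t_{ij}+1)!}
```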

26 of 47

Stochastic block model for flat clustering for K=3 groups

Group 3 (7 vertices)

Group 1 (7 vertices)

Group 2 (6 vertices)

27 of 47

Stochastic block model for flat clustering for K=3 groups

Group 3

Group 1

Group 2

28 of 47

Maximum likelihood parameter estimation

  • Suppose we know what the cluster memberships are
  • Parameters of SBM are θij
  • Maximum likelihood estimate of parameters is

  • The probability of edges/holes between groups i and j

  • can be re-written as

Hence, everything can be expressed in terms of the edge presence and absence pattern
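The quantities on this slide can be written out as follows (standard Bernoulli maximum likelihood, with the edge/hole counts defined on the earlier SBM slides):

```latex
% Maximum likelihood estimate of the edge probability between groups i and j:
\hat{\theta}_{ij} = \frac{e_{ij}}{t_{ij}} = \frac{e_{ij}}{e_{ij} + h_{ij}}

% Probability of the edges and holes between groups i and j:
P(\text{edges, holes} \mid \theta_{ij})
  = \theta_{ij}^{\,e_{ij}} (1-\theta_{ij})^{h_{ij}}

% Substituting the MLE, everything becomes a function of edge/hole counts:
P(\text{edges, holes} \mid \hat{\theta}_{ij})
  = \left(\frac{e_{ij}}{t_{ij}}\right)^{e_{ij}}
    \left(\frac{h_{ij}}{t_{ij}}\right)^{h_{ij}}
```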

29 of 47

Dynamic stochastic block models

  • SBMs are meant for static networks
  • Dynamic SBMs extend SBMs to model dynamic membership and module evolution (though not all models capture both)
    • DSBM, Yang et al., 2011
    • Extended Kalman filter, Xu et al., 2014
      • Gaussian approximation of the probabilities
    • Hierarchical stochastic block model, Amini et al., 2021

30 of 47

Dynamic Stochastic Block Model

  • Extends SBMs to handle modules that evolve dynamically over time
  • Fully Bayesian framework, with online and offline versions
  • Allows nodes to join or leave over time

Yang et al, Machine learning 2011

31 of 47

DSBM generative model

How the modules change

How the edges get generated

32 of 47

DSBM generative model

33 of 47

DSBM Likelihood

Graphs at each time point

Module membership at each time point

34 of 47

DSBM Likelihood

Emission probability

Transition probability
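A sketch of the likelihood factorization named on this slide (HMM-style; the symbols Wt for the graph, Zt for module assignments, P for block edge probabilities, and A for the module transition matrix are assumed notation, not necessarily the paper's):

```latex
% Joint likelihood: a Markov chain over module assignments,
% with a graph emitted at each time point
P(W^{(1:T)}, Z^{(1:T)})
  = P(Z^{(1)})
    \prod_{t=2}^{T} \underbrace{P(Z^{(t)} \mid Z^{(t-1)})}_{\text{transition}}
    \prod_{t=1}^{T} \underbrace{P(W^{(t)} \mid Z^{(t)})}_{\text{emission}}

% Emission: each possible edge is Bernoulli given its endpoints' modules
P(W^{(t)} \mid Z^{(t)})
  = \prod_{u < v} P_{z_u z_v}^{\, w^{(t)}_{uv}}
    \left(1 - P_{z_u z_v}\right)^{1 - w^{(t)}_{uv}}

% Transition: each node changes module independently via a K x K matrix A
P(Z^{(t)} \mid Z^{(t-1)}) = \prod_{u} A_{z^{(t-1)}_u,\; z^{(t)}_u}
```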

35 of 47

DSBM learning

  • Expectation maximization for a maximum likelihood estimate

  • E-step
    • Estimate the hidden module assignments
  • M-step
    • Estimate the model parameters:
      • Transition probabilities
      • Edge presence/absence probabilities
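A simplified illustration of the M-step under hard module assignments (DSBM itself maintains soft posterior assignments from the E-step; this sketch only shows that the edge-probability estimates reduce to observed block densities):

```python
import numpy as np

def m_step_block_probs(W, z, K):
    """M-step sketch with hard assignments: the estimated probability of an
    edge between modules k and l is the observed density of that block.
    W: (n, n) symmetric 0/1 adjacency (zero diagonal); z: length-n labels."""
    theta = np.zeros((K, K))
    for k in range(K):
        for l in range(K):
            rows = np.where(z == k)[0]
            cols = np.where(z == l)[0]
            block_sum = W[np.ix_(rows, cols)].sum()
            if k == l:
                # ordered within-module pairs, excluding the diagonal
                n_pairs = len(rows) * (len(rows) - 1)
            else:
                n_pairs = len(rows) * len(cols)
            theta[k, l] = block_sum / n_pairs if n_pairs else 0.0
    return theta

# Tiny example: a triangle (module 0) loosely attached to one node (module 1)
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
z = np.array([0, 0, 0, 1])
print(m_step_block_probs(W, z, 2))
```

The transition probabilities in the M-step can be estimated analogously, as the fraction of nodes that move from module k at time t−1 to module l at time t, accumulated over consecutive time points.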

36 of 47

Simulated block structure at different noise levels

37 of 47

Comparing DSBM to other algorithms

38 of 47

Extended Kalman Filters

  • Dynamic stochastic blockmodels for time-evolving social networks
  • Works with the logit of the module interaction probabilities
  • Approximates the state and emissions with Gaussians

Xu et al, 2014

Module assignments

39 of 47

Hierarchical stochastic block model (HSBM)

  • A general approach based on SBMs to handle “multi-type” networks
  • Each type could be a time point
  • Similarity of module memberships between time points modeled via prior distributions

Amini et al, 2021

Adjacency matrices

Module interactions

Module assignment

40 of 47

HSBM compared to other algorithms

Results on simulated networks with different transition probabilities

41 of 47

Goals for this lecture

  • Network dynamics and dynamic modules on networks
  • Classes of algorithms for identifying dynamic modules
  • Modularity based algorithms
  • Spectral clustering and smoothing
  • Dynamic Stochastic block models
  • Dynamic topic models

42 of 47

Topic models

  • Latent Dirichlet allocation (LDA) is a widely used model for discovering topics in documents
  • Given a set of D documents and a number of topics K, LDA specifies a generative model for the D documents

43 of 47

LDA generative model

  • For each topic k = 1..K, draw a word distribution βk ~ Dirichlet(η)
  • For each document d, draw a topic distribution θd ~ Dirichlet(α)
  • For each word position in document d:
    • Draw a topic assignment z ~ Multinomial(θd)
    • Draw the word w ~ Multinomial(βz)

44 of 47

LDA applied to a graph (LDA-G)

  • LDA-G treats each node's neighbor list as a "document" whose "words" are the neighbors; the inferred topics then correspond to groups of nodes

Keith Henderson and Tina Eliassi-Rad. Applying latent Dirichlet allocation to group discovery in large graphs. (ACM-SAC), 2009.

45 of 47

Dynamic topic models

  • Blei & Lafferty, 2006
  • Extends LDA to document collections indexed by time
  • Topics evolve smoothly across time: the natural parameters of each topic's word distribution follow a Gaussian random walk

46 of 47

Conclusions

  • Many social and biological networks are dynamic
  • Dynamic module identification requires us to infer the network modules at each time point
  • Algorithms differ based on
    • Optimization framework
    • Whether the number of communities can vary over time
    • Whether community dynamics are modeled explicitly

47 of 47

References

  • General review
    • Sun Z, Sheng J, Wang B, Ullah A, Khawaja F. Identifying Communities in Dynamic Networks Using Information Dynamics. Entropy (Basel). 2020;22(4):E425. doi:10.3390/e22040425
  • Spectral smoothing:
    • Liu F, Choi D, Xie L, Roeder K. Global spectral clustering in dynamic networks. Proc Natl Acad Sci USA. 2018;115(5):927-932. doi:10.1073/pnas.1718449115
    • Huang Q, Zhao C, Zhang X, Yi D. Community discovering in temporal network with spectral fusion. Chaos. 2019;29(4):043122. doi:10.1063/1.5086769
  • Stochastic block models
    • Amini AA, Paez MS, Lin L. Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks. arXiv:1904.05330 [cs, stat]. Published online August 11, 2021. Accessed November 11, 2021. http://arxiv.org/abs/1904.05330
    • Xu KS, Hero III AO. Dynamic stochastic blockmodels for time-evolving social networks. IEEE J Sel Top Signal Process. 2014;8(4):552-562. doi:10.1109/JSTSP.2014.2310294
    • Yang T, Chi Y, Zhu S, Gong Y, Jin R. Detecting communities and their evolutions in dynamic social networks—a Bayesian approach. Mach Learn. 2011;82(2):157-189. doi:10.1007/s10994-010-5214-7
  • Topic models
    • Blei DM, Lafferty JD. Dynamic topic models. In: Proceedings of the 23rd International Conference on Machine Learning  - ICML ’06. ACM Press; 2006:113-120. doi:10.1145/1143844.1143859
    • Keith Henderson and Tina Eliassi-Rad. Applying latent Dirichlet allocation to group discovery in large graphs. (ACM-SAC), 2009. https://dl.acm.org/doi/10.1145/1529282.1529607