Anomaly Detection in Dynamic Graphs
guest lecture by Shenyang(Andy) Huang
October 14th 2022
Comp 599: Network Science, Fall 2022
Outline
Comp 599: Network Science
2
Definition of an Anomaly
An anomaly is “an observation that differs so much from other observations as to arouse suspicion that it was generated by a different mechanism."
said Douglas M Hawkins in 1980.
reference:
Douglas M Hawkins.Identification of outliers. Vol. 11. Springer, 1980
Reference: Anomaly Detection: A Survey
In practice, concrete definition can depend on:
Comp 599: Network Science
3
Key components associated with anomaly detection
Reference: Anomaly Detection: A Survey
Comp 599: Network Science
4
Data Types
Categorized by relationships between data points
Comp 599: Network Science
5
Anomalies in Graph
Figure from: :
Akoglu L, Tong H, Koutra D. Graph based
anomaly detection and description: a survey.
Data mining and knowledge discovery. 2015
May 1;29(3):626-88
Comp 599: Network Science
6
Anomalies in Graph
Figure from: :
Akoglu L, Tong H, Koutra D. Graph based
anomaly detection and description: a survey.
Data mining and knowledge discovery. 2015
May 1;29(3):626-88
Comp 599: Network Science
7
Static Graph Anomalies
Definition:
Given a graph 𝐆 = (𝐕,𝐄)
Find the nodes / edges / subgraphs which are rare and different or deviate significantly from the pattern observed in the graph
Consider an anomaly scoring function fₐ(x) ∈ [0,1], x is entity of interest
fₐ(x) → 0, normal
fₐ(x) → 1, abnormal
Comp 599: Network Science
8
Challenges for Static Graph Anomalies
Comp 599: Network Science
9
Outline
Comp 599: Network Science
10
Representing Dynamic Graphs
Comp 599: Network Science
11
Anomaly Detection in Dynamic Graphs
Dynamic graphs can be directed / undirected, weighted, attributed etc. depend on the type of the graph
Comp 599: Network Science
12
Challenges for Dynamic Graph Anomaly Detection
Comp 599: Network Science
13
A General Approach
How to detect anomalous entities in a dynamic graph?
Effectively designing, computing and analyzing an anomaly score fₐ(x)
Comp 599: Network Science
14
Anomalous nodes
Set of nodes which have ‘irregular’ evolution when compared to other nodes
Example applications:
Comp 599: Network Science
15
Anomalous edges
Reference:
Fast and Accurate Anomaly Detection in Dynamic Graphs with a Two-Pronged Approach (KDD 2019)
Edges which have abnormal structural or weight changes
(or other types of abnormal evolution)
Example applications:
Comp 599: Network Science
16
Anomalous Subgraphs
Finding anomalous evolution for a fixed set of subgraphs
Enumerating all possible subgraph is intractable
Example applications:
Comp 599: Network Science
17
Anomalous Community Evolution
Shrinking community
Splitting community
Comp 599: Network Science
18
Anomalous Snapshots
Identify time points where the underlying graph generative model changes
(change points)
or the overall graph structure undergoes drastic one-time changes
(Events)
Example Applications:
Comp 599: Network Science
19
Families of Approaches
Comp 599: Network Science
20
Outline
Comp 599: Network Science
21
Example Dynamic Networks
Canadian Bill Voting Network
MP - Member of Parliament
University of California, Irvine social network
Comp 599: Network Science
22
Laplacian Anomaly Detection (LAD)
Detects the changes in Canadian Member of Parliament voting pattern. 2013 is identified as an anomalous year due to increase in cross community communication (as Justin Trudeau is elected leader of Liberal Party)
A spectral method for anomalous snapshot detection, presented in KDD 2020
Comp 599: Network Science
23
Key Components of LAD
Using the Laplacian eigenvalues σₜ
Extract norm from a short term and a long term sliding window
Cosine similarity between two vectors
Comp 599: Network Science
24
LAD Methodology
Comp 599: Network Science
25
Properties of Laplacian Eigenvalues
Comp 599: Network Science
26
Laplacian Eigenvalues & Connectivity
L = D - A for a graph G
0 = λ₁ ≤ λ₂ ≤ ... ≤ λₙ
λ₂ ≠ 0 iff the graph is connected
# 0 eigenvalues = # of connected components
Comp 599: Network Science
27
Laplacian Eigenvalues & Geometry of the Graph
some simple graph structures and their Laplacian eigenvalues:
Proof and more details see Chapter 6 in
Spectral and Algebraic Graph Theory by Daniel A. Spielman
Comp 599: Network Science
28
Recall: Stochastic Block Model (SBM)
Create synthetic community structure
Parameters:
Comp 599: Network Science
29
Synthetic Dynamic Graphs
Change point = change in community structure in SBM
Event = sudden increase of cross community connections
Comp 599: Network Science
30
Laplacian Spectrum
US Senate Co-sponsorship Network
UCI Message Network
weighted, directed social network
Publicly Available Data and Code
Paper link: https://dl.acm.org/doi/10.1145/3394486.3403077
Code Repository: https://github.com/shenyangHuang/LAD
All experiment is reproducible
and all dataset is in the repo if interested
Outline
Comp 599: Network Science
34
Multi-view Change Point Detection
Given a multi-view dynamic graph 𝔾 = {𝒢ₜ} where 1 ≤ t ≤ T and
𝒢ₜ = {Gᵣ} and 1 ≤ r ≤ L where Gᵣ = (V, E)
each view is a dynamic graph that describes an overall graph generative model H,
Can we detect time points in time where H undergoes drastic changes?
Key idea: leverage multi-view nature of the data to better recover the underlying generative model
Comp 599: Network Science
35
Multi-view Network Examples
Comp 599: Network Science
36
MultiCPD: a multi-view extension of LAD
New York City Taxi dataset
From Jan 2020 to Apr 2020
Each view is a different type of taxi or for hire vehicle service
03.22 is the start of the New York on Pause program
Comp 599: Network Science
37
How to merge information from multiple views?
Comp 599: Network Science
38
Key Component of MultiCPD
Comp 599: Network Science
39
Algorithm of MultiCPD
Comp 599: Network Science
40
Multi-view Data Improves Performance
Increasing difficulty, only change points
Increasing noise, event and change point
Comp 599: Network Science
41
Increasing Number of Views
Only change points
increasing m
Comp 599: Network Science
42
NYC Taxi Dataset 2020
Comp 599: Network Science
43
Outline
Comp 599: Network Science
44
How to scale to large dynamic networks?
Difficult to scale to large graphs with millions of nodes
Much more scalable
Fast and efficient approximation
Losses information about exact values of eigenvalues in most cases
Comp 599: Network Science
45
What is Density of States (Spectral Density)
Reference: Network Density of States
Computed by an efficient approximation method named
Network Density of States or just DOS
Comp 599: Network Science
46
Spectral Density of different networks
Reference: Network Density of States
Comp 599: Network Science
47
Fast and Attributed Change Detection on Dynamic Graphs with Density of States
SBM model with 10 communities
SBM model with 2 communities
Prior version Reference: Scalable Change Point Detection for Dynamic Graphs
Comp 599: Network Science
48
Change in DOS for BA model
Increase in m, number of edges attached from a new node to an existing node
Comp 599: Network Science
49
Performance Comparison
SPCD Outperforming other methods in synthetic experiments
Comp 599: Network Science
50
Computational Complexity: O(E)
Comp 599: Network Science
51
MAG-History Co-authorship Network
Co-authorship network in the history community
Comp 599: Network Science
52
Attribute Changes in Flight Network
Attributes are the country of the airport (node)
Detects flight route closure for chinese airports at beginning of COVID-19 pandemic, Feb 2020
Comp 599: Network Science
53
Thanks for Listening!
If you have any questions, feel free to reach out!
shenyang.huang@mail.mcgill.ca or on slack
Comp 599: Network Science
54
Tips for Research
Comp 599: Network Science
55
Bonus Topics
Comp 599: Network Science
56
AnomRank
Structural anomaly
ANOMALYS
Weight anomaly
ANOMALYW
Comp 599: Network Science
57
ANOMRANK Overview
Comp 599: Network Science
58
ANOMRANK Algorithm
Pros:
Cons:
Comp 599: Network Science
59
SPOTLIGHT
Comp 599: Network Science
60
How SPOTLIGHT Works
Key: use dictionaries for sublinear memory and fast run time
Assumes nodes don’t disappear
Comp 599: Network Science
61
SPOTLIGHT Sketch Space
Comp 599: Network Science
62
Laplacian Eigenvalue Slides
Comp 599: Network Science
63
Star Graph
What would be the Laplacian eigenvalues ?
Hint: let there be n nodes,
1 hub with degree n,
Other nodes have degree 1
Comp 599: Network Science
64
Spectral Graph Theory: Star Graph
Comp 599: Network Science
65
Spectral Graph Theory: Star Graph
Comp 599: Network Science
66
Spectral Graph Theory: Star Graph
Comp 599: Network Science
67
Fully Connected Graph
What would be the Laplacian eigenvalues of a fully connected graph?
Hint: let there be n nodes, we know that this is the highest possible connectivity
Comp 599: Network Science
68
Spectral Graph Theory: Fully Connected Graph
The eigenvalues would be
Proof and more details see Chapter 6 in
Spectral and Algebraic Graph Theory by Daniel A. Spielman
Comp 599: Network Science
69