Unlocking Graph Neural Networks
A Hands-on Journey from Basics to Breakthroughs
Giuseppe Futia - CSI Piemonte, Consortium for Information Systems (Italy)
Our Journey
Theory: Message Passing Neural Network Principles, from Theory to Code
Layers: Graph Convolutional Networks (GCNs), GraphSAGE, Graph Attention Networks (GATs)
Applications: Node Classification (Anti-Money Laundering), Link Prediction (Recommender Systems), LLMs (Chat with Your Graph)
https://github.com/giuseppefutia/klab?tab=readme-ov-file#--unlocking-graph-neural-networks
Traditional ML for Graphs
Input Graph → Structured Features → Learning Algorithm → Prediction
Feature-engineering task (node-, edge-, and graph-level features), followed by the downstream prediction task
Graph Representation Learning (GRL)
Input Graph → Learning Algorithm → Structured Features → Prediction
Representation learning: automatically learn the features for the downstream prediction task
Theory: Message Passing Neural Network Principles, from Theory to Code
Message Passing Layer
Message() → Aggregate() → Update()
GOAL: Represent a node’s features based on its relational structure
Message() inputs: the node’s initial features and the neighborhood features
Message(): parametrized through an MLP
Parametrize “Something” Through a Multilayer Perceptron
One-hot encoding representation:
0 | 0 | 0 | 1 | 0 |
0 | 0 | 0 | 0 | 1 |
0 | 1 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 |
0 | 0 | 1 | 0 | 0 |
Each row represents a node in a graph… or a word in a vocabulary.
Let’s Multiply by a Parameter Matrix…
… and Obtain a New Matrix

0 | 0 | 0 | 1 | 0 |       0.55 | -0.70 | 0.29 |        0.44 |  0.02 | 0.45 |
0 | 0 | 0 | 0 | 1 |      -0.51 |  0.89 | 0.89 |        0.64 |  0.27 | 0.67 |
0 | 1 | 0 | 0 | 0 |   X   0.12 |  0.20 | 0.05 |   =   -0.51 |  0.89 | 0.89 |
1 | 0 | 0 | 0 | 0 |       0.44 |  0.02 | 0.45 |        0.55 | -0.70 | 0.29 |
0 | 0 | 1 | 0 | 0 |       0.64 |  0.27 | 0.67 |        0.12 |  0.20 | 0.05 |
Let’s Analyze the Results…
Each one-hot row selects exactly one row of the parameter matrix: row i of the result is the parameter-matrix row picked out by the 1 in row i of the one-hot matrix.
We Encoded Our Node (or Word) Representation in a New Vector Space
We Have a Learnable Embedding Representation
The Final Step - A Non-Linear Transformation
Let’s Apply ReLU…

0 | 0 | 0 | 1 | 0 |       0.55 | -0.70 | 0.29 |        0.44 | 0.02 | 0.45 |
0 | 0 | 0 | 0 | 1 |      -0.51 |  0.89 | 0.89 |        0.64 | 0.27 | 0.67 |
0 | 1 | 0 | 0 | 0 |   X   0.12 |  0.20 | 0.05 |   =    0    | 0.89 | 0.89 |
1 | 0 | 0 | 0 | 0 |       0.44 |  0.02 | 0.45 |        0.55 | 0    | 0.29 |
0 | 0 | 1 | 0 | 0 |       0.64 |  0.27 | 0.67 |        0.12 | 0.20 | 0.05 |

Here the result is shown after ReLU: the negative entries (-0.51 and -0.70) are zeroed out, the rest are unchanged.
We Parameterized Our Input Features Through an MLP!
Aggregate(): use a permutation-invariant function (e.g., sum, mean, max)
https://medium.com/ai-advances/the-expressive-power-of-gnns-invariance-and-equivariance-101768971cd9
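A quick sketch of what permutation invariance means in practice: reordering the neighbor messages leaves sum, mean, and max unchanged (values chosen so the float sums are exact).

```python
import torch

messages = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])  # messages from 3 neighbors
shuffled = messages[torch.randperm(3)]                   # same messages, new order

assert torch.equal(messages.sum(0), shuffled.sum(0))
assert torch.equal(messages.mean(0), shuffled.mean(0))
assert torch.equal(messages.max(0).values, shuffled.max(0).values)
```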
Update(): update h_i by concatenating its previous features with the aggregated messages and passing the result through an MLP
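Putting the three functions together, here is a minimal plain-PyTorch sketch of one message passing layer, assuming sum aggregation; the names msg_mlp and upd_mlp are illustrative.

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())      # Message()
        self.upd_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())  # Update()

    def forward(self, h, edge_index):
        src, dst = edge_index                      # edges j -> i
        msg = self.msg_mlp(h[src])                 # Message(): one message per edge
        agg = torch.zeros_like(h)                  # Aggregate(): sum per target node
        agg.index_add_(0, dst, msg)
        return self.upd_mlp(torch.cat([h, agg], dim=-1))  # Update(): concat + MLP

h = torch.randn(5, 8)                              # 5 nodes, 8 features
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
h_new = MessagePassingLayer(8)(h, edge_index)
```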
Message Passing Layer(s)
Stacking multiple Message() → Aggregate() → Update() rounds: each additional layer extends a node’s receptive field by one hop.
Message Passing Neural Network Model
Message Passing Layer(s) → Readout() → output graph vector
Readout(): aggregate the final node features into a single graph vector, again using a permutation-invariant function
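A readout can be as small as one pooling call; a sketch assuming mean pooling over the final node features:

```python
import torch

h = torch.randn(5, 8)          # final node features after the MP layers
graph_vector = h.mean(dim=0)   # permutation-invariant readout (sum/max also work)
```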
Message Passing Layer in PyTorch Geometric (PyG)
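A minimal sketch of the same layer on top of PyG’s MessagePassing base class, assuming torch_geometric is installed; the aggregation choice and MLPs mirror the plain-PyTorch version above.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import MessagePassing

class MPLayer(MessagePassing):
    def __init__(self, dim):
        super().__init__(aggr='add')               # Aggregate()
        self.msg_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.upd_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, x, edge_index):
        return self.propagate(edge_index, x=x)     # triggers message/aggregate/update

    def message(self, x_j):                        # Message(): x_j = neighbor features
        return self.msg_mlp(x_j)

    def update(self, aggr_out, x):                 # Update(): concat + MLP
        return self.upd_mlp(torch.cat([x, aggr_out], dim=-1))
```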
Intuitions on the Scatter Operation in PyG
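The aggregation above is implemented with a scatter: group the per-edge messages by destination node and reduce each group. A small sketch using index_add_ as a stand-in for the scatter-sum:

```python
import torch

msg = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # one message per edge
dst = torch.tensor([0, 0, 1, 2])                  # destination node of each edge

out = torch.zeros(3, 1)
out.index_add_(0, dst, msg)   # scatter-sum: node 0 gets 1+2, node 1 gets 3, node 2 gets 4
print(out)                    # tensor([[3.], [3.], [4.]])
```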
Layers: Graph Convolutional Networks (GCNs), GraphSAGE, Graph Attention Networks (GATs)
Any GNN Layer is a Message Passing Layer
MLP vs GNN
An MLP transforms each node’s features in isolation; a GNN additionally mixes in the features of the node’s neighbors at every layer.
Specializing the MP Layer…
Let’s consider ℓ = 0: x_i represents the initial features of node i, the input to the 1st MP layer.
GCN
Graph Convolutional Network
From the MP Layer… to the Graph Convolutional Network (GCN) Layer
Let’s Focus on the GCN Layer
62
Let’s Focus on the GCN Layer
63
“Sum” as Permutation Invariance Aggregator
64
GCN - Cached Normalized Adj Matrix
65
GCN Notebook
GraphSAGE
Inductive Representation Learning
From the MP Layer… to GraphSAGE
“Mean” as permutation-invariant aggregator
GraphSAGE - Different Weight Matrices: separate weights for the target node and its neighbors, which is useful for the bipartite graphs adopted in recommender systems
GraphSAGE: concatenate the neighbor features with the target features and parametrize the result with an MLP
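A minimal plain-PyTorch sketch of the GraphSAGE idea under these assumptions: mean aggregation, concatenation, and a single linear map standing in for the MLP (concatenation followed by one linear layer is equivalent to separate weight matrices for self and neighbors).

```python
import torch
import torch.nn as nn

class SAGELayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)   # implicit W_self and W_neigh

    def forward(self, h, edge_index):
        src, dst = edge_index
        agg = torch.zeros_like(h)
        agg.index_add_(0, dst, h[src])       # sum of neighbor features
        count = torch.zeros(h.size(0)).index_add_(
            0, dst, torch.ones_like(dst, dtype=h.dtype))
        agg = agg / count.clamp(min=1).unsqueeze(-1)   # mean aggregation
        return self.lin(torch.cat([h, agg], dim=-1))   # concat + linear

h = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
out = SAGELayer(8)(h, edge_index)
```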
GraphSAGE Notebook
GCN vs GraphSAGE
GAT
Graph Attention Network
From the MP Layer… to the Graph Attention Network (GAT)
GAT - Learnable Weighted Edges
GAT - Let’s See How This Computation Works
Step 1: apply a shared linear transformation W to every node’s features (we could have W_1 and W_2 in the case of bipartite graphs).
Step 2: for each edge (i, j), compute an unnormalized attention score from the transformed features of the two endpoints.
Step 3: normalize the scores over each node’s neighborhood with a softmax and use them as weights when aggregating the neighbor features.
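A sketch of the three steps for a single attention head, following the original GAT formulation; shapes and names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, dim = 5, 8
h = torch.randn(N, dim)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
src, dst = edge_index

W = nn.Linear(dim, dim, bias=False)     # Step 1: shared linear transform
a = nn.Linear(2 * dim, 1, bias=False)   # attention vector

z = W(h)
e = F.leaky_relu(a(torch.cat([z[dst], z[src]], dim=-1)), 0.2)  # Step 2: raw scores

# Step 3: softmax over each destination node's incoming edges
e = e - e.max()                                      # numerical stability
denom = torch.zeros(N, 1).index_add_(0, dst, e.exp())
alpha = e.exp() / denom[dst]                         # normalized attention weights

out = torch.zeros(N, dim).index_add_(0, dst, alpha * z[src])  # weighted aggregation
```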
GCN vs GAT
GraphSAGE vs GAT
Applications: Node Classification (Anti-Money Laundering), Link Prediction (Recommender Systems), LLMs (Chat with Your Graph)
GNN Applications
A Framework for Node Classification and Link Prediction
Input Data: semi-structured sources consisting of multiple files that need to be transformed and processed into a graph format.
Graph Processor: preprocessing phase to transform the input data into a graph represented as a PyTorch Geometric data structure (homogeneous or heterogeneous).
Encoder (Embedding Model + GNN Model): encoding phase to transform node features into learnable embeddings using message passing.
Decoder (Node Classification / Link Prediction): decoding phase to perform the downstream task on the graph, aiming to reconstruct and capture its essential properties.
Trained Model: output of the encoder-decoder architecture, ready for use in the inference phase.
Node Classification
A Practical Example with Bitcoin Data
Pipeline instantiation: Input Data (elliptic_txs_features.csv, elliptic_txs_edgelist.csv, elliptic_txs_classes.csv) → Graph Processor → PyG Homogeneous Data → Encoder (Embedding Model + Homogeneous GNN Model) → Decoder (Log Softmax + Cross Entropy Loss) → Trained Model
https://colab.research.google.com/drive/1D72FAxMUDy8VDJHgWGwwGX57J-iBaxvE?usp=sharing
Data preparation flow: elliptic_txs_features.csv + elliptic_txs_edgelist.csv + elliptic_txs_classes.csv → Original ID Dropping → Incremental ID Mapping → edge_index, node_features, node_labels (class/label encoding) → Homogeneous Graph Data → Node Masking → Training set (80%), Validation set (10%), Testing set (10%)
The Elliptic dataset is a time-series graph with 200K+ classified Bitcoin transactions (nodes), 234K payment flows (edges), and 166 anonymized node features.
Data preparation creates tensors by removing original node IDs, assigning incremental IDs, mapping them to edges, and encoding classes as numbers.
Tensors from the previous step will be added to a PyTorch Geometric Data object. A node masking approach will allocate nodes to training (80%), validation (10%), or testing (10%).
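A sketch of this preparation, assuming the three CSVs from the public Elliptic dataset (column names txId1/txId2/class follow that release; the notebook’s details may differ):

```python
import pandas as pd
import torch
from torch_geometric.data import Data

features = pd.read_csv('elliptic_txs_features.csv', header=None)
edges = pd.read_csv('elliptic_txs_edgelist.csv')
classes = pd.read_csv('elliptic_txs_classes.csv')

# Drop original transaction IDs and map them to incremental IDs
id_map = {tx: i for i, tx in enumerate(features[0])}
edge_index = torch.tensor([[id_map[s] for s in edges['txId1']],
                           [id_map[t] for t in edges['txId2']]])

x = torch.tensor(features.iloc[:, 1:].values, dtype=torch.float)
# '1' = illicit, '2' = licit; assumes classes rows are aligned with features rows
y = torch.tensor(classes['class'].map({'1': 1, '2': 0, 'unknown': -1}).values)

data = Data(x=x, edge_index=edge_index, y=y)

# Node masking: 80% training, 10% validation, 10% testing
n = data.num_nodes
perm = torch.randperm(n)
data.train_mask = torch.zeros(n, dtype=torch.bool).index_fill_(0, perm[:int(0.8 * n)], True)
data.val_mask = torch.zeros(n, dtype=torch.bool).index_fill_(0, perm[int(0.8 * n):int(0.9 * n)], True)
data.test_mask = torch.zeros(n, dtype=torch.bool).index_fill_(0, perm[int(0.9 * n):], True)
```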
Model flow: Homogeneous Graph Data → GCN / SAGE / GAT → Aggregated Neighbor Features → Log Softmax → class probabilities (e.g., 0.2 / 0.8 / 0.4)
Homogeneous graph data consisting of a single node type (transaction) and a single edge type (Bitcoin flow).
The homogeneous graph data is processed using three distinct GNN encoders: Graph Convolutional Network (GCN), Graph Attention Network (GAT), and GraphSAGE (SAGE).
Each GNN encoder generates a unique node representation based on its specific aggregation and update functions.
The Log Softmax function transforms the node representation computed by a GNN encoder into log-probabilities over the classes.
These probabilities are used to determine whether a node represents a licit or illicit transaction.
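A sketch of the encoder-decoder for this task, reusing the data object from the preparation sketch above and assuming a 2-layer GCN encoder; log_softmax plus negative log-likelihood implements the cross-entropy loss of the slides.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNClassifier(torch.nn.Module):
    def __init__(self, in_dim, hid_dim, num_classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, num_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return F.log_softmax(self.conv2(h, edge_index), dim=-1)

model = GCNClassifier(data.num_features, 64)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
optimizer.zero_grad()
out = model(data.x, data.edge_index)
mask = data.train_mask & (data.y >= 0)   # ignore unlabeled nodes
loss = F.nll_loss(out[mask], data.y[mask])
loss.backward()
optimizer.step()
```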
Link Prediction
A Practical Example with the MovieLens Dataset
Pipeline instantiation: Input Data (movies.csv, ratings.csv) → Graph Processor → PyG Heterogeneous Data → Encoder (Embedding Model + Heterogeneous GNN Model) → Decoder (Dot Product + Binary Cross Entropy Loss) → Trained Model
https://colab.research.google.com/drive/1MsPfrN1yUeRWI3TSH3oiCAVBfcpJx4rF?usp=sharing
Data preparation flow: movies.csv + ratings.csv → Incremental ID Mapping, Genre Encoding → movie node_features, user-movie edge_index → Heterogeneous Graph Data → Edge Splitting → Training set (80%), Validation set (10%), Testing set (10%) → Mini-batch Loading (Batch 1 … Batch 4)
The MovieLens dataset contains 100,000 ratings and 3,600 tag applications for 9,000 movies, provided by 600 users.
Data preparation generates tensors by assigning incremental IDs to movies and users, encoding genres into a feature vector, and creating an edge index that connects users to movies.
Tensors from the previous step will be added to a PyTorch Geometric HeteroData object, which includes two node types (users and movies) and one edge type (user-rates-movie).
Edges within the ("user", "rates", "movie") relation are divided into training, validation, and testing sets. The training set is further split into edges for message passing and supervision. In this phase, we also generate negative examples for the validation and testing datasets.
The next step is to create mini-batches generating subgraphs for GNN input, including negative examples for training. This is essential for large graphs that exceed memory capacity.
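A sketch of the edge splitting and mini-batch loading, assuming a populated HeteroData object named data; RandomLinkSplit and LinkNeighborLoader are the PyG utilities typically used for this recipe.

```python
import torch_geometric.transforms as T
from torch_geometric.loader import LinkNeighborLoader

transform = T.RandomLinkSplit(
    num_val=0.1, num_test=0.1,
    disjoint_train_ratio=0.3,          # split training edges: message passing vs supervision
    neg_sampling_ratio=2.0,            # negative examples for validation/testing
    add_negative_train_samples=False,  # training negatives are sampled per mini-batch
    edge_types=('user', 'rates', 'movie'),
    rev_edge_types=('movie', 'rev_rates', 'user'),
)
train_data, val_data, test_data = transform(data)

# Mini-batch loading: sample subgraphs around the supervision edges
train_loader = LinkNeighborLoader(
    train_data,
    num_neighbors=[20, 10],
    edge_label_index=(('user', 'rates', 'movie'),
                      train_data['user', 'rates', 'movie'].edge_label_index),
    edge_label=train_data['user', 'rates', 'movie'].edge_label,
    neg_sampling_ratio=2.0,
    batch_size=128,
    shuffle=True,
)
```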
Model flow: Heterogeneous Graph Data → Embedding → H-GraphConv / H-SAGE / H-GAT → Aggregated Neighbor Features → Dot Product → link probabilities (e.g., 0.2 / 0.8 / 0.4)
Heterogeneous graph data consisting of two node types (users and movies) and one edge type (user-rates-movie).
The heterogeneous graph data is processed using three distinct heterogeneous GNN encoders: Graph Convolutional Network (GCN), Graph Attention Network (GAT), and GraphSAGE (SAGE).
Each heterogeneous GNN encoder generates unique representations for user and movie nodes based on its specific aggregation and update functions.
The dot product measures compatibility between users and movies based on their embeddings. A higher value indicates a greater likelihood of interaction or rating, such as a user's interest in a particular movie.
The dot product scores are converted into probabilities, indicating the likelihood of a link existing between users and movies.
Embeddings are generated for both users and movies to enhance the encoder input. Since users lack intrinsic features, their embeddings are learned by the model; for movies, a feature vector encoding their genres serves as input to the embedding process.
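A sketch of the dot-product decoder with binary cross-entropy, using random embeddings as stand-ins for the encoder output (600 users and 9,000 movies, as in the dataset description):

```python
import torch
import torch.nn.functional as F

def decode(user_emb, movie_emb, edge_label_index):
    src, dst = edge_label_index
    return (user_emb[src] * movie_emb[dst]).sum(dim=-1)  # dot product per candidate edge

user_emb = torch.randn(600, 64)     # learned user embeddings (no intrinsic features)
movie_emb = torch.randn(9000, 64)   # movie embeddings from genre features
edge_label_index = torch.tensor([[0, 1, 2], [10, 20, 30]])
edge_label = torch.tensor([1., 0., 1.])                  # 1 = observed, 0 = negative

logits = decode(user_emb, movie_emb, edge_label_index)
loss = F.binary_cross_entropy_with_logits(logits, edge_label)
prob = torch.sigmoid(logits)        # likelihood of a user-movie link
```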
G-Retriever
“Chat with your graph” Empowered with GNNs
GraphRAG-Based Agent
An analyst asks a question; the AI Agent’s brain decides “Which tool should I use?” and “Do I have enough context to generate the answer?”. Private data is chunked into a Vector Database for the Semantic Retriever and modeled as a KG Database for the KG Retriever.
Vector Search: the Semantic Retriever fetches relevant chunks from the Vector Database.
Query Generation: the KG Retriever generates queries against the KG Database.
THE MISSING PIECE: a GNN Semantic Retriever on top of the KG Database.
G-Retriever Architecture
Pipeline: Indexing of Textual Graph Information (WebQSP Knowledge Graph → textual node/edge embeddings via all-roberta-large-v1) → Nodes and Edges Retrieval (driven by the Question) → Subgraph Construction (Prize-Collecting Steiner Tree) → LLM Text Encoder (Embedded Tokens) + Graph Encoder (GAT) + Projection (MLP) → Graph Soft Prompt → LLM Decoder → Generated Tokens
1. Indexing
Embed the textual information of each node and edge of the WebQSP Knowledge Graph with all-roberta-large-v1.
2. Retrieval
Embed the question with the same text encoder and retrieve the most similar nodes and edges.
Question: What is the name of Justin Bieber’s brother?
Retrieved node_attr: justin bieber, jaxon bieber, jeremy bieber, justin bieber fan club
Retrieved edge_attr: sibling, hangout, friend, children
2. Retrieval (ii)
The retrieved nodes and edges map back into the KG: justin bieber, jeremy bieber, jaxon bieber, and pattie malette, connected through parent, children, and sibling relations via the intermediate node m.0gxnnwp.
3. Subgraph Construction
Each retrieved node and edge receives a prize reflecting its relevance to the question (e.g., 5, 3, 1, or 0 in the example), while edges carry a cost C. The Prize-Collecting Steiner Tree (PCST) algorithm extracts a connected subgraph that maximizes the collected prizes while minimizing the edge costs, keeping justin bieber, jeremy bieber, jaxon bieber, m.0gxnnwp, and the parent/children/sibling edges among them.
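A sketch of this step, assuming the pcst_fast package (used by the G-Retriever reference implementation); the node ordering, prizes, costs, and edge list are illustrative.

```python
import numpy as np
from pcst_fast import pcst_fast

# Illustrative nodes 0..4: justin bieber, jeremy bieber, jaxon bieber,
# m.0gxnnwp, pattie malette
edges = np.array([[0, 3], [2, 3], [1, 0], [1, 2], [4, 0]])
prizes = np.array([5.0, 3.0, 5.0, 0.0, 1.0])  # similarity-based node prizes
costs = np.full(len(edges), 0.5)              # edge cost C

# root=-1 (unrooted), 1 output component, 'gw' pruning, verbosity 0
vertices, kept_edges = pcst_fast(edges, prizes, costs, -1, 1, 'gw', 0)
print(vertices)   # node indices of the prize-maximizing connected subgraph
```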
Question and Textual Graph Embedding
The question (“What is the name of Justin Bieber’s brother?”) is embedded by the LLM text encoder, producing the embedded tokens.
The constructed subgraph is serialized as text and embedded together with the question:
node_id, node_attr
15, justin bieber
294, jaxon bieber
356, jeremy bieber
551, m.0gxnnwp
src, edge_attr, dst
151, people.person.children, 15
294, people.person.parents, 356
15, people.person.sibling_s, 551
294, people.person.sibling_s, 551
551, people.sibling_relationship.sibling, 294
551, people.sibling_relationship.sibling, 15
Graph Soft Prompt (Learnable Parameters)
The subgraph is also encoded by a Graph Encoder (GAT) and projected by an MLP; the resulting graph soft prompt is prepended to the embedded tokens.
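A sketch of how such a soft prompt can be produced, assuming a PyG GAT encoder, mean pooling, and an LLM with embedding size llm_dim; all names and dimensions are illustrative.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, global_mean_pool

class GraphPrompter(nn.Module):
    def __init__(self, node_dim, hidden_dim, llm_dim):
        super().__init__()
        self.gat = GATConv(node_dim, hidden_dim, heads=4, concat=False)
        self.proj = nn.Sequential(nn.Linear(hidden_dim, llm_dim), nn.ReLU(),
                                  nn.Linear(llm_dim, llm_dim))   # Projection MLP

    def forward(self, x, edge_index, batch):
        h = self.gat(x, edge_index)        # Graph Encoder (GAT)
        g = global_mean_pool(h, batch)     # one vector per subgraph
        return self.proj(g).unsqueeze(1)   # (batch, 1, llm_dim) soft prompt

x = torch.randn(4, 16)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
batch = torch.zeros(4, dtype=torch.long)   # single subgraph
soft_prompt = GraphPrompter(16, 32, 4096)(x, edge_index, batch)
# Prepended to the token embeddings before the frozen LLM decoder:
# inputs_embeds = torch.cat([soft_prompt, token_embeddings], dim=1)
```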
Generated Tokens
The LLM decoder consumes the graph soft prompt together with the embedded tokens and generates the answer: “jaxon bieber”.
Let’s Chat with a GNN-KG!
https://colab.research.google.com/drive/1dJUYq5VbuskVnLeWrZXT4t-jz2vdr5Gl?usp=sharing
Discount code: kgconf25 (45% off all Manning products; expires May 25)