Advancements in Knowledge Graph Reasoning
Innovative Approaches to Complex Logical Query Answering and Logical Hypothesis Generation
Jiaxin Bai
KnowComp, Department of CSE, The Hong Kong University of Science and Technology
Jiaxin Bai, KnowComp, HKUST
1
4/8/2025
Roadmap
Limitations of Current AI Systems
Need Grounded Reasoning on Structured Data!
For example a knowledge graph
Structured Knowledge: Knowledge Graphs
G consisting of:
How to use KG for Reasoning?
Complex Query Answering
Logical Hypothesis Generation
Complex Session Intention Understanding
Roadmap
Complex Query Answering
How do we deal with the incompleteness of KG?
How do we scale up to large knowledge graphs and long queries?
Complex Queries | Interpretations |
| Find where the Canadian Turing award laureates graduated from. |
| Find the substances that interact with the proteins associated with diseases T1, T2, or T3. |
| Find entities, who are Germans, were the Nobel Prize winners and eventually moved to the United States. |
Query Encoding
How do we deal with the incompleteness of KG?
Use embeddings to represent queries and subqueries
How do we scale up to large knowledge graphs and long queries?
Using computation to replace graph search / subgraph matching
Only a single approximate nearest neighbor search in inference
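The idea on this slide can be sketched in a few lines. This is a minimal illustration, not any specific published model: the 2-d embeddings are hand-picked, relational projection is assumed to be a translation, intersection is assumed to be an element-wise mean, and a brute-force search stands in for approximate nearest neighbor search. The names `person_a`, `person_b`, `person_c`, `project`, `intersect`, and `nearest_entities` are hypothetical.

```python
import math

# Hypothetical 2-d embeddings, hand-picked for illustration (not learned).
ENTITY_EMB = {
    "TuringAward": (0.0, 0.0),
    "Canada": (4.0, 0.0),
    "person_a": (1.1, 0.9),
    "person_b": (2.5, 2.0),
    "person_c": (-1.0, 1.0),
}
RELATION_EMB = {"HasWinner": (1.0, 1.0), "HasCitizen": (-3.0, 1.0)}


def project(query_vec, relation):
    """Relational projection, assumed here to be a simple translation."""
    offset = RELATION_EMB[relation]
    return tuple(q + o for q, o in zip(query_vec, offset))


def intersect(*query_vecs):
    """Intersection, assumed here to be the element-wise mean."""
    return tuple(sum(dims) / len(query_vecs) for dims in zip(*query_vecs))


def nearest_entities(query_vec, k=1, exclude=()):
    """Brute-force nearest neighbor search over all entity embeddings."""
    candidates = [name for name in ENTITY_EMB if name not in exclude]
    candidates.sort(key=lambda name: math.dist(ENTITY_EMB[name], query_vec))
    return candidates[:k]


# Encode the two sub-queries, combine them, then search once.
winners = project(ENTITY_EMB["TuringAward"], "HasWinner")
citizens = project(ENTITY_EMB["Canada"], "HasCitizen")
query_vec = intersect(winners, citizens)
top = nearest_entities(query_vec, k=1, exclude=("TuringAward", "Canada"))
print(query_vec, top)
```

The whole query is reduced to one vector, so answering needs a single nearest neighbor lookup regardless of how many operators the query contains.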
[Figure: computation graph with anchors Turing Award and Canada, relations HasWinner and HasCitizen, and an Intersection operator]
Neural Query Encoders
Models | Encoding Structures |
GQE [1] | Vector Embedding |
Query2Box [2] | Box Embedding |
Query2Particles [3] | Multiple Vectors Embedding |
FuzzQE [4] | Fuzzy Logic Embedding |
Neural MLP [5] | Vector Embedding |
NewLook [6] | Vector Embedding |
… | … |
Figures from [1]
Figures from [2]
Figures from [3]
Query2Particles: Knowledge Graph Reasoning with Particle Embeddings
Jiaxin Bai, Zihao Wang, Hongming Zhang, Yangqiu Song
NAACL-2022 (Findings)
Embedding Space and Set Representations
[Figure: a computation graph (anchors Turing Award and Canada; operators Has Winner, Has Citizen, Complement, Intersection, Graduate) shown next to the corresponding embedding space]
Multi-hop logical operations make the query answers diverse.
The answer embeddings form set(s) scattered across the embedding space.
Vector Embeddings
Box Embeddings
Particle Embeddings
Example from: Jiaxin Bai, Zihao Wang, Hongming Zhang, Yangqiu Song: Query2Particles: Knowledge Graph Reasoning with Particle Embeddings. NAACL-HLT (Findings) 2022: 2703-2714
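A toy comparison may help show why several particles can represent scattered answers better than a single vector. The sketch below assumes that an entity is scored by its maximum inner product over the particles; the embeddings and the names `answer_1`, `answer_2`, `distractor`, and `particle_score` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def particle_score(particles, entity_vec):
    """Score an entity by its best match over all particles (assumed rule)."""
    return max(dot(p, entity_vec) for p in particles)


# Two true answers lie in different directions; a distractor lies between.
ENTITY_EMB = {
    "answer_1": (1.0, 0.0),
    "answer_2": (0.0, 1.0),
    "distractor": (0.7, 0.7),
}
particles = [(1.0, 0.0), (0.0, 1.0)]   # one particle per answer cluster
single_vector = (0.5, 0.5)             # the mean of the two particles


def rank(score_fn):
    """Return entity names sorted from highest to lowest score."""
    return sorted(ENTITY_EMB, key=lambda n: score_fn(ENTITY_EMB[n]), reverse=True)


particle_ranking = rank(lambda e: particle_score(particles, e))
vector_ranking = rank(lambda e: dot(single_vector, e))
print(particle_ranking)
print(vector_ranking)
```

With one averaged vector the distractor scores highest, while the particle representation ranks both answers above it.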
Relational Projection
Intersection, Union, and Negation
Intersection
Complement
Union
Training Query2Particles
Dataset
The basic information about the three knowledge graphs used for the experiments.
The detailed information for the queries used for training, validating, and testing all query embedding methods.
Training Query2Particles
[Figure: structures of the query types 1p, 2p, 3p, 2i, 3i, 2in, 3in, inp, pni, pin, ip, pi, 2u, up, grouped into in-distribution types and out-of-distribution types]
p: projection
i: intersection
n: negation
u: union
In-distribution: used for training and evaluation
Out-of-distribution: no training, evaluation only
Comparison with baselines
Queries with Diverse Answers
Models | 1P | 2I | 2U | 2IN | Average |
Q2P-1P | 44.8 | 28.8 | 11.3 | 15.0 | 25.0 |
Q2P-2P | 49.4 | 35.5 | 13.3 | 20.7 | 29.7 |
Q2P-3P | 53.0 | 37.7 | 18.6 | 21.6 | 32.8 |
MRR on the top ten percent of queries ranked by answer diversity
Diversity is measured by the number of answers.
Significantly better on the queries with diverse answers
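For readers unfamiliar with the metrics, here is a small sketch of MRR and Hit@k under a common filtered protocol, where each answer is ranked against non-answers only. The slides do not state the exact evaluation protocol, so this is an assumption; the function names and the toy ranking are hypothetical.

```python
def filtered_ranks(ranked_entities, answers):
    """Rank of each answer, counting only non-answers ranked above it."""
    ranks = []
    non_answers_seen = 0
    for entity in ranked_entities:
        if entity in answers:
            ranks.append(non_answers_seen + 1)
        else:
            non_answers_seen += 1
    return ranks


def mrr(ranks):
    """Mean reciprocal rank over the answers of one query."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hit_at_k(ranks, k):
    """Fraction of answers whose filtered rank is at most k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy model output: entities sorted from most to least likely.
ranking = ["a", "x", "b", "y", "c"]
ranks = filtered_ranks(ranking, answers={"a", "b"})
print(ranks, mrr(ranks), hit_at_k(ranks, 1), hit_at_k(ranks, 3))
```

Answers missing from the ranking are ignored in this sketch; a full evaluation would rank every entity in the graph.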
Sequential Query Encoding For Complex Query Answering on Knowledge Graphs
Jiaxin Bai*, Tianshi Zheng*, Yangqiu Song
Transactions of Machine Learning Research
Neural Networks as Operators
[Figure: computation graph for the example query: Assoc is applied to Mad Cow Disease and to Alzheimer's Disease, the two results are combined by Union, and Interact leads to the Answers]
From a query to a computation graph
Do we have to parameterize and then execute such a computation graph?
From Computing to Encoding
[(] [P] [Interact]
[(] [U]
[(] [P] [Assoc] [MadCow] [)]
[(] [P] [Assoc] [Alzheimer ] [)]
[)]
[)]
Tokenization
Sequential Query Encoding
Sequence Encoder
[(] [P] [Interact] [(] [U] [(] [P] [Assoc] [MadCow] [)] [(] [P] [Assoc] [Alzheimer] [)] [)] [)]
[Figure: the sequence encoder reads the token embeddings E[(], E[P], E[Interact], E[(], E[U], …, E[)]; the figure also shows an output labeled C and the answer [Melanin]]
Sequence Encoder: Transformers, LSTMs, Temporal CNN…
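The tokenization step can be sketched as a recursive traversal that wraps every operator application in bracket tokens. The nested-tuple input format and the function name `tokenize` are assumptions made for this illustration; the output reproduces the token sequence shown on the slide.

```python
def tokenize(query):
    """Linearize a nested query into bracketed prefix tokens."""
    if isinstance(query, str):
        # Operators, relations, and entities each become one token.
        return [f"[{query}]"]
    tokens = ["[(]"]
    for part in query:
        tokens.extend(tokenize(part))
    tokens.append("[)]")
    return tokens


# Interact( Union( Assoc(MadCow), Assoc(Alzheimer) ) )
query = (
    "P", "Interact",
    ("U",
     ("P", "Assoc", "MadCow"),
     ("P", "Assoc", "Alzheimer")),
)
sequence = " ".join(tokenize(query))
print(sequence)
```

The resulting sequence can then be fed to any sequence encoder without defining a separate neural operator for each logical operation.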
Experiments - Benchmarks
Dataset | In-distribution Types | Out-of-distribution Types | Total |
Query2Box | 5 | 4 | 9 |
BetaE | 10 | 4 | 14 |
SMORE | 10 | 4 | 14 |
This Paper (SQE) | 29 | 29 | 58 |
We construct a larger benchmark with diverse query types.
Results
Dataset | Model | In-distribution: Entailment | In-distribution: Inference | Out-of-distribution: Entailment | Out-of-distribution: Inference |
FB15K-237 | ConE | 36.69 | 9.75 | 28.13 | 8.82 |
FB15K-237 | BetaE | 32.48 | 8.30 | 22.96 | 7.29 |
FB15K-237 | Q2P | 52.33 | 10.17 | 32.70 | 8.62 |
FB15K-237 | Neural MLP | 51.09 | 10.03 | 36.85 | 8.75 |
FB15K-237 | + MLP Mixer | 45.19 | 10.07 | 33.03 | 8.66 |
FB15K-237 | SQE + CNN | 52.09 | 10.14 | 28.21 | 7.65 |
FB15K-237 | SQE + GRU | 55.46 | 10.59 | 32.25 | 8.34 |
FB15K-237 | SQE + LSTM | 56.02 | 10.62 | 33.41 | 8.62 |
FB15K-237 | SQE + Transformer | 59.15 | 11.30 | 15.06 | 4.98 |
Operator-level parameterization is good for compositional generalization to unseen query types.
Sequential query encoding is good for queries whose types are seen during training.
Knowledge Graph Reasoning over Entities and Numerical Values
Jiaxin Bai, Chen Luo, Zheng Li, Qingyu Yin, Bing Yin, Yangqiu Song
KDD-2023
Numerical Complex Query Answering
Complex Queries | Interpretations | Category |
| Find the Turing Award winners who were born before 1927. | Numerical CQA |
| Find the US states that have a higher latitude than Beijing. | Numerical CQA |
| Find the US states whose population is less than half that of California. | Numerical CQA |
Number Reasoning Network
Find the cities that have a higher latitude than Japanese cities.
(1) Relational Projection
(2) Attribute Projection
(3) Numerical Projection
(4) Reverse Attribute Projection
[Figure: the four projection steps, illustrated with latitude values 20°, 30°, 40°, and 50°]
Jiaxin Bai, Chen Luo, Zheng Li, Qingyu Yin, Bing Yin, Yangqiu Song: Knowledge Graph Reasoning over Entities and Numerical Values. KDD 2023: 57-68
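To make the four projection types concrete, the sketch below executes the example query symbolically on a tiny hand-made graph. It is not the neural Number Reasoning Network itself: the cities, latitude values, and function names are hypothetical, and "higher than Japanese cities" is read existentially (higher than at least one Japanese city), which is an assumption.

```python
import operator

# Hypothetical toy data: which cities a country has, and city latitudes.
HAS_CITY = {"Japan": {"jp_a", "jp_b"}}
LATITUDE = {"jp_a": 30, "jp_b": 40, "city_x": 50, "city_y": 20}


def relational_projection(entities, relation):
    """(1) Entities -> entities reachable through a relation."""
    result = set()
    for entity in entities:
        result |= relation.get(entity, set())
    return result


def attribute_projection(entities, attribute):
    """(2) Entities -> their numerical attribute values."""
    return {attribute[e] for e in entities if e in attribute}


def numerical_projection(values, candidates, compare):
    """(3) Values -> candidate values that satisfy compare for some value."""
    return {c for c in candidates if any(compare(c, v) for v in values)}


def reverse_attribute_projection(values, attribute):
    """(4) Values -> entities that carry one of those values."""
    return {e for e, v in attribute.items() if v in values}


cities = relational_projection({"Japan"}, HAS_CITY)
latitudes = attribute_projection(cities, LATITUDE)
higher = numerical_projection(latitudes, set(LATITUDE.values()), operator.gt)
answers = reverse_attribute_projection(higher, LATITUDE)
print(sorted(latitudes), sorted(higher), sorted(answers))
```

Under the existential reading, `jp_b` is also an answer because its latitude exceeds that of `jp_a`.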
Main Results on Three KGs
Query Encoding | Attribute | Hit@1 | Hit@3 | Hit@10 | MRR |
GQE | Baseline | 10.33 | 18.19 | 27.91 | 16.29 |
GQE | NRN + DICE | 11.03 | 19.18 | 29.01 | 17.15 |
GQE | NRN + Sinusoidal | 11.14 | 19.39 | 29.23 | 17.31 |
Q2P | Baseline | 10.22 | 17.35 | 26.61 | 15.81 |
Q2P | NRN + DICE | 11.86 | 19.70 | 29.46 | 17.84 |
Q2P | NRN + Sinusoidal | 12.25 | 20.16 | 29.96 | 18.28 |
Q2B | Baseline | 11.81 | 20.93 | 31.19 | 18.41 |
Q2B | NRN + DICE | 12.52 | 22.09 | 32.34 | 19.34 |
Q2B | NRN + Sinusoidal | 12.75 | 22.22 | 32.46 | 19.51 |
Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints
Jiaxin Bai, Xin Liu, Weiqi Wang, Chen Luo, Yangqiu Song
NeurIPS-2023
ASER (Activities, States, Events, and their Relations)
https://github.com/HKUST-KnowComp/ASER
Hongming Zhang, Xin Liu, Haojie Pan, Yangqiu Song, Cane Wing-Ki Leung: ASER: A Large-scale Eventuality Knowledge Graph. WWW 2020: 201-211
Katz, J. J., & Fodor, J. A. (1963). The structure of a semantic theory. Language, 39(2), 170–210.
Yorick Wilks. 1975. An intelligent analyzer and understander of English. Communications of the ACM, 18(5):264–274.
Principle 1: Comparing semantic meanings by fixing grammar (Katz and Fodor, 1963)
Principle 2: The need of language inference based on ‘partial information’ (Wilks, 1975)
CQA on Eventuality Knowledge Graph
Complex queries on eventuality graphs differ from those on entity-relation graphs.
Whether and when the eventualities occur is important.
Queries | Interpretations | Type |
| Find the substances that interact with the proteins associated with Alzheimer’s and Mad cow disease. | Entity |
| Instead of buying an umbrella, PersonX goes home. What happened before PersonX goes home? | Eventuality |
| Food is bad before PersonX adds soy sauce. What is the reason for food being bad? | Eventuality |
Jiaxin Bai, Xin Liu, Weiqi Wang, Chen Luo, Yangqiu Song: Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints. NeurIPS, 2023
Query Encoding with Constraint Memory
[Figure: computational graph and key-value constraint memory. Labels shown: target variable V?; eventualities "PersonX complains", "PersonX leaves restaurant", "Food is bad", "PersonY adds ketchup", "PersonY adds vinegar", "PersonY adds soy sauce"; relations and operators Succession, Precedence, Reason, ChosenAlter., Intersection; steps (1) to (3)]
Jiaxin Bai, Xin Liu, Weiqi Wang, Chen Luo, Yangqiu Song: Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints. NeurIPS, 2023
The MEQE Combined with Various QE methods
Models | Occurrence Constraints: Hit@1 | Occurrence Constraints: Hit@3 | Occurrence Constraints: MRR | Temporal Constraints: Hit@1 | Temporal Constraints: Hit@3 | Temporal Constraints: MRR | Average: Hit@1 | Average: Hit@3 | Average: MRR |
GQE | 8.92 | 14.21 | 13.09 | 9.09 | 14.03 | 12.94 | 9.12 | 14.12 | 13.02 |
+ MEQE | 10.20 | 15.54 | 14.31 | 10.70 | 15.67 | 14.50 | 10.45 | 15.60 | 14.41 |
Q2P | 14.14 | 19.97 | 18.84 | 14.48 | 19.69 | 18.68 | 14.31 | 19.83 | 18.76 |
+ MEQE | 15.15 | 20.67 | 19.38 | 16.06 | 20.82 | 19.74 | 15.61 | 20.74 | 19.56 |
Neural MLP | 13.03 | 19.21 | 17.75 | 13.45 | 19.06 | 17.68 | 13.24 | 19.14 | 17.71 |
+ MEQE | 15.26 | 20.69 | 19.32 | 15.91 | 20.63 | 19.47 | 15.58 | 20.66 | 19.40 |
FuzzQE | 11.68 | 18.64 | 17.07 | 11.68 | 17.97 | 16.53 | 11.68 | 18.31 | 16.80 |
+ MEQE | 14.76 | 21.12 | 19.45 | 15.31 | 21.01 | 19.49 | 15.03 | 21.06 | 19.47 |
Roadmap
Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation
Jiaxin Bai*, Yicheng Wang*, Tianshi Zheng, Yue Guo, Xin Liu, Yangqiu Song
ACL-2024
Abductive Reasoning
To use structured knowledge in KG to explain observations.
Observations (O) | Hypotheses (H) | Hypotheses Interpretations |
| | The actors and screenwriters born in Los Angeles |
| | The Apple products released in 2010 that are not phones |
| | The disease whose symptoms can be relieved by Panadol |
Tokenization of Hypothesis
[Figure: hypothesis graph with nodes numbered 1 to 9, covering the entities [Apple], [2010], [Phone], the relations [Brand], [Release], [Type], and the operators [I] and [N]]
Tokens: [I][I][Brand][Apple][Release][2010][N][Type][Phone]
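The token sequence above is a prefix (operator-first) encoding, so it can be parsed back into a tree once each token's arity is known. The sketch below assumes that [I] takes two arguments, that [N] and relation tokens take one, and that any other token is an entity; the relation set and the function name `parse` are hypothetical.

```python
# Tokens treated as relations in this sketch (an assumption).
RELATIONS = {"Brand", "Release", "Type"}


def parse(tokens, pos=0):
    """Parse prefix tokens from position pos; return (tree, next position)."""
    token = tokens[pos]
    if token == "I":
        # Intersection: two sub-hypotheses follow.
        left, pos = parse(tokens, pos + 1)
        right, pos = parse(tokens, pos)
        return ("I", left, right), pos
    if token == "N" or token in RELATIONS:
        # Negation and relations: one argument follows.
        child, pos = parse(tokens, pos + 1)
        return (token, child), pos
    # Anything else is an entity (a leaf).
    return token, pos + 1


tokens = ["I", "I", "Brand", "Apple", "Release", "2010", "N", "Type", "Phone"]
tree, end = parse(tokens)
print(tree)
```

The parsed tree reads as "Apple products released in 2010 that are not phones", matching the interpretation given in the hypothesis table.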
Complex Logical Hypothesis Generation
Step 1: Sample observation-hypothesis pairs from the KG.
Step 2: Train the hypothesis generation model using teacher forcing.
[Figure: the KG yields pairs of observations and hypotheses; the hypothesis generation model maps observations to generated hypotheses]
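Step 3: Optimize the hypothesis generation model with Reinforcement Learning from Knowledge Graph Feedback (RLF-KG).
[Figure: the generated hypothesis is executed on the KG to obtain the hypothesis conclusion, which is compared with the observation using Jaccard similarity; PPO training (policy gradient optimization) uses this signal together with a KL-divergence term between the log-probabilities of the model and of a reference model]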
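The feedback signal in Step 3 can be illustrated with a toy example: execute a hypothesis on the graph, then compare its conclusion with the observation by Jaccard similarity. The triples, entity names, tree format, and function names below are hypothetical, and treating negation as a complement over all head entities is an assumption of this sketch.

```python
# Hypothetical toy KG of (head, relation, tail) triples.
TRIPLES = [
    ("tablet_1", "Brand", "Apple"), ("tablet_1", "Release", "2010"),
    ("tablet_1", "Type", "Tablet"),
    ("phone_1", "Brand", "Apple"), ("phone_1", "Release", "2010"),
    ("phone_1", "Type", "Phone"),
    ("player_1", "Brand", "Apple"), ("player_1", "Release", "2010"),
    ("player_1", "Type", "Player"),
    ("laptop_1", "Brand", "Apple"), ("laptop_1", "Release", "2008"),
    ("laptop_1", "Type", "Laptop"),
]
ENTITIES = {head for head, _, _ in TRIPLES}


def execute(hypothesis):
    """Return the set of entities that satisfy a hypothesis tree."""
    op = hypothesis[0]
    if op == "I":
        return execute(hypothesis[1]) & execute(hypothesis[2])
    if op == "U":
        return execute(hypothesis[1]) | execute(hypothesis[2])
    if op == "N":
        return ENTITIES - execute(hypothesis[1])
    relation, value = hypothesis
    return {h for h, r, t in TRIPLES if r == relation and t == value}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


observation = {"tablet_1", "player_1"}
partial = ("I", ("Brand", "Apple"), ("Release", "2010"))
full = ("I", partial, ("N", ("Type", "Phone")))
reward_partial = jaccard(execute(partial), observation)
reward_full = jaccard(execute(full), observation)
print(reward_partial, reward_full)
```

The more precise hypothesis earns the higher reward, which is the kind of signal a policy-gradient method can optimize.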
Dataset
This figure provides basic information about the three knowledge graphs utilized in our experiments. The graphs are divided into standard sets of training, validation, and testing edges to facilitate the evaluation process.
The detailed information about the queries used for training, validation, and testing.
Performance
Dataset | Model | 1p | 2p | 2i | 3i | ip | pi | 2u | up | 2in | 3in | pni | pin | inp | Ave. |
FB15k-237 | Enc.-Dec. | 0.626 | 0.617 | 0.551 | 0.513 | 0.576 | 0.493 | 0.818 | 0.613 | 0.532 | 0.451 | 0.499 | 0.529 | 0.533 | 0.565 |
FB15k-237 | + RLF-KG | 0.855 | 0.711 | 0.661 | 0.595 | 0.715 | 0.608 | 0.776 | 0.698 | 0.670 | 0.530 | 0.617 | 0.590 | 0.637 | 0.666 |
FB15k-237 | Dec.-Only | 0.666 | 0.643 | 0.593 | 0.554 | 0.612 | 0.533 | 0.807 | 0.638 | 0.588 | 0.503 | 0.549 | 0.559 | 0.564 | 0.601 |
FB15k-237 | + RLF-KG | 0.789 | 0.681 | 0.656 | 0.605 | 0.683 | 0.600 | 0.817 | 0.672 | 0.672 | 0.560 | 0.627 | 0.596 | 0.626 | 0.660 |
Real Examples:
Output from Supervised Training:
Output after RLF-KG Training:
Roadmap
Understanding Inter-Session Intentions via Complex Logical Reasoning
Jiaxin Bai, Chen Luo, Zheng Li, Qingyu Yin, Yangqiu Song
KDD-2024
Search with Logic is Hard
Search with hidden intentions from sessions
Integrating sessions, attributes, and logics
For Product Recommendation
Find an item with the brand Nike
Or
Find an item with the brand Adidas
= Find an item with the brand Adidas or Nike
And
Find the next item of a given session
= Find the next item of a session with the brand Nike or Adidas
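Read symbolically, the composition above is a union of two attribute queries intersected with a next-item query. The sketch below uses plain sets on made-up data; the item names, the candidate list for `session_1`, and the helper `items_with_brand` are hypothetical, and a real system would predict next-item candidates with a session encoder instead of a lookup table.

```python
# Hypothetical next-item candidates per session (normally model-predicted).
NEXT_ITEM_CANDIDATES = {"session_1": {"shoe_1", "shoe_2", "shirt_1"}}

# Hypothetical item-to-brand attribute table.
BRAND = {
    "shoe_1": "Nike",
    "shoe_2": "OtherBrand",
    "shirt_1": "Adidas",
    "cap_1": "Nike",
}


def items_with_brand(brand):
    """All items whose brand attribute equals the given brand."""
    return {item for item, b in BRAND.items() if b == brand}


# Or: an item with the brand Nike or Adidas.
nike_or_adidas = items_with_brand("Nike") | items_with_brand("Adidas")

# And: the next item of the session that also has one of these brands.
answer = NEXT_ITEM_CANDIDATES["session_1"] & nike_or_adidas
print(sorted(answer))
```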
Logical Session-CQA on Hypergraph
[Figure, three panels: (A) Hypergraph. (B) Hyper-Relational KG, with the labels Photoelectric Effect, Discovered By, Albert Einstein, Educated At, ETH Zurich, and Degree: BSc. (C) Hyper Session Graph, with Session1 and Session2 over Item1 to Item4, the attributes Brand (Nike, Adidas) and Colour (Red, Blue), and the labels Hyperedge1 and Hyperedge2]
CQA Methods for Inter-Session Logic Reasoning
N-ary QE methods
StarQE [1] and NQE [2] are designed for hyper-relational KGs and are difficult to adapt to session graphs.
SQE [3] can be extended to N-ary queries, but it cannot capture some important properties of the query graph, such as the permutation invariance of AND and OR.
Session Encoders + Logic Encoders
During logical reasoning, the logic encoder can access only the session embedding, not the individual items in the session.
[Figure: query graph with the operators Brand, Intersection, Next, and Next]
[(] [P] [Brand]
[(] [I]
[(] [P] [Next] [(] [S] [Item1,1] … [Item1,m] [)] [)]
[(] [P] [Next] [(] [S] [Item2,1] … [Item2,n-1] [Item2,n] [)] [)]
[)]
[)]
Need a new query encoding method on session hypergraph
[1] Dimitrios Alivanistos, Max Berrendorf, Michael Cochez, Mikhail Galkin: Query Embedding on Hyper-Relational Knowledge Graphs. ICLR 2022
[2] Haoran Luo, Haihong E, Yuhao Yang, Gengxian Zhou, Yikai Guo, Tianyu Yao, Zichen Tang, Xueyuan Lin, Kaiyang Wan: NQE: N-ary Query Embedding for Complex Query Answering over Hyper-Relational Knowledge Graphs. AAAI 2023
[3] Jiaxin Bai, Tianshi Zheng, Yangqiu Song: Sequential Query Encoding for Complex Query Answering on Knowledge Graphs. Trans. Mach. Learn. Res. 2023 (2023)
Logical-Session Graph Transformer
[Figure: Query Graph with item nodes v1 to v4, session nodes S1 and S2, nodes P1 to P3, node I1, and the operators Next, Next, Intersection, and Brand]
Find the item that is desired by both Session1 and Session2.
Node identifiers (from 0 to 9) are assigned to every node in the query graph, including items, sessions, and operators.
[Figure: the query graph, with the nodes v1, v2, v3, v4, S1, S2, P1, P2, P3, I1 listed against the identifiers 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
[Figure: input to the Transformer Encoder. A graph token [g] is followed by node tokens for items and operators ([v1] to [v4], [S], [S], [P], [P], [P], [I]) and by edge tokens for session structures and logical structures ([Next], [Next], [I], [I], [Brand]). Each token carries a type identifier ([node] or [edge]) and node identifiers; the encoder output is used for [Predictions]]
[1] Jinwoo Kim, Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong: Pure Transformers are Powerful Graph Learners. NeurIPS 2022
Experiment Dataset
Dataset | Train Graph: Vertices | Train Graph: Edges | Validation Graph: Vertices | Validation Graph: Edges | Test Graph: Vertices | Test Graph: Edges | #Sessions | #Items | #Values | #Relations |
Amazon | 2,258,179 | 7,234,680 | 2,345,475 | 7,620,527 | 2,431,747 | 8,004,984 | 720,816 | 431,036 | 1,279,895 | 10 |
Diginetica | 257,018 | 1,286,384 | 261,996 | 1,337,628 | 266,897 | 1,387,861 | 12,047 | 134,904 | 125,204 | 3 |
Dressipi | 611,520 | 2,435,932 | 643,140 | 2,567,128 | 674,853 | 2,698,692 | 668,650 | 23,618 | 903 | 74 |
Dataset | Train Queries: Item-Attributes | Train Queries: Others | Validation Queries: All Types | Test Queries: All Types |
Amazon | 2,535,506 | 720,816 | 36,041 | 36,041 |
Diginetica | 249,562 | 60,235 | 3,012 | 3,012 |
Dressipi | 414,083 | 668,650 | 33,433 | 33,433 |
[Figure: structures of the query types 1p, 2p, 3i, 2iA, 2iS, ip, pi, 2uS, up, 2inS, 2inA, inp, pin, 3in, 3iA, 3ip, 3inA, 3inp, grouped into supervised training types and zero-shot types]
Experiment Results
Dataset | Query Encoder | Session Encoder | Average-EPFO | Average-Negation |
Amazon | FuzzQE | GRURec | 30.94 | 21.03 |
Amazon | FuzzQE | SRGNN | 31.75 | 22.26 |
Amazon | FuzzQE | Attn-Mixer | 31.68 | 25.00 |
Amazon | Q2P | GRURec | 23.09 | 14.61 |
Amazon | Q2P | SRGNN | 24.59 | 16.62 |
Amazon | Q2P | Attn-Mixer | 28.16 | 26.95 |
Amazon | NQE | - | 23.19 | 18.12 |
Amazon | SQE-Transformer | - | 30.07 | 27.16 |
Amazon | SQE-LSTM | - | 32.53 | 27.13 |
Amazon | LSGT (Ours) | - | 33.26 | 29.69 |
Existential Positive First-Order (EPFO) queries: query types that involve conjunction and disjunction, with existentially quantified variables and no negation.
Out-of-Distribution Queries Results
Dataset | Query Encoder | 3iA | 3ip | 3inA | 3inp | Average OOD |
Amazon | FuzzQE + Attn-Mixer | 66.72 | 29.67 | 54.33 | 48.76 | 49.87 |
Amazon | Q2P + Attn-Mixer | 33.51 | 11.42 | 51.47 | 41.46 | 34.47 |
Amazon | NQE | 61.72 | 1.98 | 46.47 | 34.04 | 36.72 |
Amazon | SQE + Transformers | 66.03 | 28.41 | 55.61 | 51.28 | 50.33 |
Amazon | LSGT (Ours) | 68.44 | 34.22 | 58.50 | 51.49 | 53.16 |
Diginetica | FuzzQE + Attn-Mixer | 88.30 | 32.88 | 82.75 | 34.50 | 59.61 |
Diginetica | Q2P + Attn-Mixer | 40.28 | 43.93 | 54.31 | 48.20 | 46.68 |
Diginetica | NQE | 86.25 | 20.79 | 64.74 | 20.93 | 48.18 |
Diginetica | SQE + Transformers | 88.05 | 31.33 | 81.77 | 35.83 | 59.25 |
Diginetica | LSGT (Ours) | 91.71 | 35.24 | 83.30 | 41.05 | 62.83 |
Dressipi | FuzzQE + Attn-Mixer | 65.43 | 95.64 | 53.36 | 97.75 | 78.05 |
Dressipi | Q2P + Attn-Mixer | 60.64 | 96.78 | 52.22 | 97.28 | 76.73 |
Dressipi | NQE | 31.96 | 96.18 | 9.89 | 97.80 | 58.96 |
Dressipi | SQE + Transformers | 72.61 | 97.12 | 55.20 | 98.14 | 80.77 |
Dressipi | LSGT (Ours) | 74.34 | 97.30 | 58.30 | 98.23 | 82.04 |
OOD query types: query types that are not seen during the training phase.
Because their sub-queries are seen in training, performance on these types measures compositional generalization.
Roadmap
Future Work: Neural Graph Database
Text Data
Database Data
Neural Graph Database
🡪 Database Foundation Model
Large Language Model
[1] https://youtu.be/1yvBqasHLZs
[2] Wang, Y., Wang, X., Gan, Q., Wang, M., Yang, Q., Wipf, D., & Zhang, M. (2025). Griffin: Towards a Graph-Centric Relational Database Foundation Model. arXiv preprint arXiv:2505.05568.
[2]
Ilya Sutskever [1]: LLM pre-training scaling will stop because we only have one internet!
Neural Graph Databases (NGDBs) show great potential to scale to a Database Foundation Model!
Future Work – Agentic Database
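[Figure: example knowledge graph around the Turing Award with the people Hinton, Bengio, LeCun, and Knuth, the years 1938, 1947, and 1964, and the places and institutions UofT, Toronto, Stanford, Montreal, and New York, connected by public and private edges; the Neural Graph Database consists of an Embedding Storage and a Query Engine]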
Language Modeling via Next Token Prediction:
We would like to invite esteemed senior professors who have made significant contributions to computer science to give a talk on-site. To accommodate them, we have decided to hold the seminar in ________
Task 1: Generating Good NGDB Queries.
Neural Graph Database Reasoning:
Where do the Turing Award winners born before 1940 live?
Task 2: Improving NGDB Reasoning.
Task 3: Incorporating NGDB Results.
…… we have decided to hold the seminar in __Toronto__.
Task 4: Application of Agentic NGDB
Thank you for your attention! ☺
More related work on my website: bjx.fun