In-depth Analysis of Continuous Subgraph Matching in a Common Delta Query Compilation Framework
June, 2024
Yukyoung Lee, Kyoungmin Kim, Wonseok Lee, Wook-Shin Han
POSTECH
Data Systems Lab@POSTECH
Subgraph Matching (SM)
2
A
B
C
query graph q
B
C
B
A
Data Systems Lab@POSTECH
Continuous Subgraph Matching (CSM)
3
A
B
C
query graph q
positive match
for edge insertion
negative match
for edge deletion
Updates for the data graph
B
C
B
A
B
C
A
B
B
B
A
C
Data Systems Lab@POSTECH
Continuous Subgraph Matching (CSM)
4
A
B
C
query graph q
positive match
for edge insertion
negative match
for edge deletion
Updates for the data graph
B
C
B
A
B
C
A
B
B
B
A
C
Find all positive/negative matches (i.e., delta of matches) for each update operation
Data Systems Lab@POSTECH
Applications of CSM
5
Fraud Detection
Cyber Security
Emergency Response
Social Recommendation
Data Systems Lab@POSTECH
Existing CSM Methods
6
| Join-based (relational aspect) | Exploration-based (graph aspect) |
Maintain intermediate results | SJ-Tree | TurboFlux SymBi RapidFlow CaLiG |
Not maintain any intermediate results | Graphflow | - |
Extend partial matches by executing join
Extend partial matches by traversing adjacency list
Partial matches
Compressed partial matches
(e.g., graph, candidate vertex set)
Data Systems Lab@POSTECH
Limitations of Existing Work
7
Data Systems Lab@POSTECH
Our Approach
8
Technique in the relational database to maintain a view V incrementally
(i.e., calculate ∆V using delta query, and then apply it to V)
Data Systems Lab@POSTECH
SM Query as a Join Query
9
A
B
C
query graph q
B
C
B
A
| |
| |
| |
| |
| |
| |
| |
| | |
| | |
Database D
Query Q
Data Systems Lab@POSTECH
CSM Query as Delta Query in IVM
10
B
C
B
A
B
C
A
B
query graph q
B
B
A
C
Updates for the data graph
| |
| |
| |
| |
A
B
C
positive match
for edge insertion
negative match
for edge deletion
Data Systems Lab@POSTECH
CSM Query as Delta Query in IVM
11
B
C
B
A
B
C
A
B
query graph q
B
B
A
C
Updates for the data graph
| |
| |
| |
| |
A
B
C
positive match
for edge insertion
negative match
for edge deletion
| | |
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
| | |
Case 1. Positive match
Data Systems Lab@POSTECH
CSM Query as Delta Query in IVM
12
B
C
B
A
B
C
A
B
query graph q
B
B
A
C
Updates for the data graph
| |
| |
| |
| |
A
B
C
positive match
for edge insertion
negative match
for edge deletion
| | |
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
| | |
Case 2. Negative match
Data Systems Lab@POSTECH
CSM Query as Delta Query in IVM
13
B
C
B
A
B
C
A
B
query graph q
B
B
A
C
Updates for the data graph
| |
| |
| |
| |
A
B
C
positive match
for edge insertion
negative match
for edge deletion
| | |
| | |
Note that ∆Q just denotes what results to retrieve.
How ∆Q executes is determined by its logical/physical plan!
We also express different CSM methods (i.e., how) into different logical/physical plans of ∆Q.
Data Systems Lab@POSTECH
Advantages of Using Delta Query Plan
14
Data Systems Lab@POSTECH
Challenges
15
| Join-based (rDB aspect) | Exploration-based (graph aspect) |
Maintain intermediate results | SJ-Tree | TurboFlux SymBi RapidFlow CaLiG |
Not maintain any intermediate results | Graphflow | - |
Extend partial matches by executing join
Hard to represent their compressed partial matches as algebraic expressions [1]
[1] Mhedhbi, A., et al., “Optimizing one-time and continuous subgraph queries using worst-case optimal joins,” TODS, 2021.
Extend partial matches by traversing adjacency list
Data Systems Lab@POSTECH
Intermediate Results as Materialized View
16
Check our paper!
Data Systems Lab@POSTECH
Overview of Our Framework
17
A
B
C
Dimension values
(specification)
Query Graph
Logical Plan Generation
Physical Plan Generation
Query Plan Compilation
Execution
Offline
Online
⋈
⋈
Hash table with
key = (B, C),
value = A
Get all CSM results!
A
B
Inserted edge
Deleted edge
…
Specify which materialized views we use, …
A
B
Data Systems Lab@POSTECH
Experiments
18
Fair comparison
In-depth analysis
Data Systems Lab@POSTECH
Overall Performance (LS)
19
Tree queries (acyclic)
Graph queries (cyclic)
SymBi
Graphflow
TurboFlux
Data Systems Lab@POSTECH
GraphFlow + SymBi Techniques
20
SymBi
Graphflow
Graphflow + SymBi view
Graphflow +
SymBi view & optimization techniques
966x
Check our paper
for more information!
Data Systems Lab@POSTECH
Conclusion
21
Data Systems Lab@POSTECH
Thank you!
22
Data Systems Lab@POSTECH