XI INTERNATIONAL CONFERENCE
“INFORMATION TECHNOLOGY AND IMPLEMENTATION” (IT&I-2024)
Graph Databases in Electronic Communications Network: Assessment Based on Query Execution Time
Oksana Herasymenko 1
Anna Ivanytska 2
1 PhD in engineering science, Assoc. Prof. of Network and Internet Technologies Department
2 Bachelor's student of Networking and Internet Technologies Department
Graph databases are a powerful tool for storing and querying connected data, and they can be applied to a variety of problems.
They are widely used in electronic communications networks, where complex network structures must be managed effectively and traditional relational databases do not always provide sufficient performance and flexibility.
Why graph databases?
The paper “Graph-based deep learning for communication networks: A survey” (Jiang, W., Computer Communications) describes how graph structures are used in various network scenarios.
The paper “A graph database for a virtualized network infrastructure” (Jamkhedkar, P., Johnson, T.) describes Nepal, a graph database designed to support the automated management of networks.
The paper “Fraudetector: A graph-mining-based framework for fraudulent phone call detection” (Tseng, V. S., Ying, J. C.) introduces a framework that uses graph-mining techniques to detect fraudulent phone calls in electronic communications.
Use cases
This study compares the query performance of the Neo4j, Memgraph and ArangoDB graph databases in an experiment that runs the same queries on the same data set and measures query execution time.
A usability estimate is also given for Neo4j, Memgraph and ArangoDB, based on the number of GitHub users, the number of image downloads, user support, deployment options and supported programming languages.
Problem statement
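The slides do not describe how execution time was recorded, so the following is only a minimal sketch of one way such a measurement could be set up. It assumes client-side wall-clock timing with a warm-up run and several repeats. `run_query` is a hypothetical zero-argument callable that wraps one query sent through whichever database driver is in use; it is not part of any of the three products.

```python
import statistics
import time
from typing import Callable, Dict


def measure_execution_time(
    run_query: Callable[[], object],
    repeats: int = 10,
    warmup: int = 1,
) -> Dict[str, float]:
    """Time a query callable and return summary statistics in milliseconds.

    ASSUMPTION: run_query is a hypothetical wrapper that sends one query
    to the database and waits for the full result.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Warm-up runs are executed but not recorded, so that caches and
    # connection setup do not distort the first measured sample.
    for _ in range(warmup):
        run_query()

    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "mean_ms": statistics.mean(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Running the same harness with the same `repeats` against each database and each graph size would keep the comparison consistent; the median is usually less sensitive to occasional slow runs than the mean.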
Used dataset: network management
| Dataset | Number of nodes | Number of relationships | Comments |
| 10less | 8474 | 12970 | Approximately 10 times smaller than full (in number of elements) |
| 5less | 16854 | 25933 | Approximately 5 times smaller than full (in number of elements) |
| full | 83847 | 181995 | |
Number of nodes and relationships in graphs used as datasets for the experiment
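As a quick check of the "approximately 10 times" and "approximately 5 times" labels, the reduction ratios can be computed directly from the counts in the table. The sketch below uses only the table values; the function name `reduction_ratios` is introduced here for illustration.

```python
# Node and relationship counts taken from the dataset table.
DATASETS = {
    "10less": {"nodes": 8474, "relationships": 12970},
    "5less": {"nodes": 16854, "relationships": 25933},
    "full": {"nodes": 83847, "relationships": 181995},
}


def reduction_ratios(datasets, reference="full"):
    """Return how many times smaller each dataset is than the reference."""
    ref = datasets[reference]
    return {
        name: {key: ref[key] / counts[key] for key in ref}
        for name, counts in datasets.items()
        if name != reference
    }


if __name__ == "__main__":
    for name, ratios in reduction_ratios(DATASETS).items():
        print(
            f"{name}: nodes x{ratios['nodes']:.1f}, "
            f"relationships x{ratios['relationships']:.1f}"
        )
```

The labels match the node counts closely (ratios of about 9.9 and 5.0), while the relationship counts shrink by larger factors (about 14.0 and 7.0), so the reduced graphs are also somewhat sparser than the full one.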
Queries used
a)
b)
c)
First group of queries measurement results
Figure 1.a
Figure 1.b
Second group of queries measurement results
Figure 2.a
Figure 2.b
Third group of queries measurement results
Figure 3.a
Figure 3.b
Figure 3.c
Third group of queries measurement results (continued)
Figure 3.d
Figure 3.e
Data for usability factor calculation
| Factor | Neo4j | Memgraph | ArangoDB |
| Years on the market | 17 | 8 | 13 |
| Number of image downloads | 100M | 100K | 10M |
| Number of GitHub users | 704 | 321 | 111 |
| User support | 1 | 1 | 0 |
| Active community | 1 | 1 | 1 |
| Existence of images to deploy in container | 1 | 1 | 1 |
| Deployment on different cloud platforms (AWS, GCP, Azure) | 3 | 3 | 2 |
| Number of supported programming languages | 7 | 12 | 5 |
| Usability factor | 0.92 | 0.79 | 0.42 |
The following equation was used to calculate the usability factor
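The equation itself is not reproduced in this text, so the authors' exact formula and weights are unknown. Purely as an illustration of how a composite score of this kind can be built, the sketch below assumes a weighted sum in which each factor is divided by the largest value in its row, with equal weights. Both the normalization and the weights are assumptions, not the authors' method, so the output is not expected to match the reported values of 0.92, 0.79 and 0.42.

```python
# Raw values from the "Data for usability factor calculation" table.
FACTORS = {
    "Years on the market": {"Neo4j": 17, "Memgraph": 8, "ArangoDB": 13},
    "Number of image downloads": {
        "Neo4j": 100_000_000, "Memgraph": 100_000, "ArangoDB": 10_000_000,
    },
    "Number of GitHub users": {"Neo4j": 704, "Memgraph": 321, "ArangoDB": 111},
    "User support": {"Neo4j": 1, "Memgraph": 1, "ArangoDB": 0},
    "Active community": {"Neo4j": 1, "Memgraph": 1, "ArangoDB": 1},
    "Container images": {"Neo4j": 1, "Memgraph": 1, "ArangoDB": 1},
    "Cloud platforms": {"Neo4j": 3, "Memgraph": 3, "ArangoDB": 2},
    "Supported languages": {"Neo4j": 7, "Memgraph": 12, "ArangoDB": 5},
}


def composite_score(factors, weights=None):
    """Weighted sum of factors, each scaled by the maximum of its row.

    ASSUMPTION: max-scaling and equal default weights are illustrative
    choices only; the formula used in the study is not known here.
    """
    if weights is None:
        weights = {name: 1.0 / len(factors) for name in factors}

    scores = {}
    for name, row in factors.items():
        top = max(row.values())
        for db, value in row.items():
            scaled = value / top if top else 0.0
            scores[db] = scores.get(db, 0.0) + weights[name] * scaled
    return scores


if __name__ == "__main__":
    ranked = sorted(composite_score(FACTORS).items(), key=lambda kv: -kv[1])
    for db, score in ranked:
        print(f"{db}: {score:.2f}")
```

Under these assumptions the ranking is the same as in the table (Neo4j, then Memgraph, then ArangoDB), although the individual values differ. Passing a different `weights` dictionary shows how strongly the result depends on the weighting.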
Contribution of each component to the overall usability factor value for Neo4j, Memgraph and ArangoDB databases
Usability factor
Conclusions
Neo4j performs 2-4 times better than Memgraph on most queries. At the same time, it was somewhat slower than ArangoDB in the first group of queries.
In the second group of queries, Neo4j showed significantly better results than Memgraph, especially on larger graphs.
For the third group of queries, the results of Memgraph and Neo4j are comparable in almost all cases.
Subjectively, Neo4j turned out to be the easiest database to work with, and it also has the highest usability factor, 0.92.
THANK YOU FOR YOUR ATTENTION!