1 of 16

XI INTERNATIONAL CONFERENCE

“INFORMATION TECHNOLOGY AND IMPLEMENTATION” (IT&I-2024)

Graph Databases in Electronic Communications Network: Assessment Based on Query Execution Time

Oksana Herasymenko 1

Anna Ivanytska 2

1 PhD in Engineering Science, Assoc. Prof., Network and Internet Technologies Department

2 Bachelor's student, Networking and Internet Technologies Department

2 of 16

Why graph databases?

Graph databases are a powerful way to store and query connected data and can be applied to a wide variety of problems.

They are widely used in electronic communications networks, where complex network structures must be managed effectively and traditional relational databases do not always provide sufficient performance and flexibility.

3 of 16

Use cases

The paper “Graph-based deep learning for communication networks: A survey” (Jiang, W., Computer Communications) describes how graph structures are used in various network scenarios.

The paper “A graph database for a virtualized network infrastructure” (Jamkhedkar, P., Johnson, T.) describes the use of a graph database named Nepal, designed to support the automated management of networks.

The paper “Fraudetector: A graph-mining-based framework for fraudulent phone call detection” (Tseng, V. S., Ying, J. C.) introduces a framework for detecting fraudulent phone calls in electronic communications using graph-mining techniques.

4 of 16

Problem statement

In this study, the Neo4j, Memgraph and ArangoDB graph databases are compared for query performance by measuring query execution time in an experiment on the same dataset.

A usability estimate is also given for Neo4j, Memgraph and ArangoDB, based on the number of GitHub users, the number of image downloads, available support, deployment options and supported programming languages.
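As an illustration of what measuring query execution time can look like in practice, here is a minimal Python sketch. It is not the measurement procedure used in the study: the helper name `time_query`, the warm-up run and the number of repeats are all assumptions, and the database call is replaced by a stand-in function so that the example runs without any database driver.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object],
               repeats: int = 5,
               warmup: int = 1) -> Dict[str, float]:
    """Run `run_query` several times and summarise wall-clock time in ms.

    `run_query` is a hypothetical zero-argument callable that sends one
    query to a database and fully consumes the result.
    """
    # Warm-up runs are not timed, so caches and connections are already set up.
    for _ in range(warmup):
        run_query()

    samples_ms: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


if __name__ == "__main__":
    # Stand-in for a real database call (assumption, for illustration only).
    def fake_query() -> int:
        return sum(range(100_000))

    print(time_query(fake_query, repeats=5))
```

Reporting the median over several repeats, rather than a single run, reduces the influence of one-off effects such as cold caches. To compare databases, the same `time_query` call would be made with one callable per database, each executing an equivalent query on the same dataset.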

5 of 16

Dataset used: network management

6 of 16

Dataset   Number of nodes   Number of relationships   Comments
10less    8474              12970                     Approximately 10 times smaller than full (in number of elements)
5less     16854             25933                     Approximately 5 times smaller than full (in number of elements)
full      83847             181995

Number of nodes and relationships in graphs used as datasets for the experiment
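The "approximately 10 times" and "approximately 5 times" comments can be checked directly from the counts in the table. The short sketch below does that arithmetic; the function name `size_ratios` is only illustrative, and "elements" is assumed here to mean nodes plus relationships.

```python
from typing import Dict, Tuple

# (number of nodes, number of relationships), taken from the table above.
DATASETS: Dict[str, Tuple[int, int]] = {
    "10less": (8474, 12970),
    "5less": (16854, 25933),
    "full": (83847, 181995),
}


def size_ratios(datasets: Dict[str, Tuple[int, int]],
                reference: str = "full") -> Dict[str, Dict[str, float]]:
    """Return how many times larger the reference graph is than each dataset."""
    ref_nodes, ref_rels = datasets[reference]
    ratios: Dict[str, Dict[str, float]] = {}
    for name, (nodes, rels) in datasets.items():
        ratios[name] = {
            "nodes": ref_nodes / nodes,
            "relationships": ref_rels / rels,
            # ASSUMPTION: "elements" = nodes + relationships.
            "elements": (ref_nodes + ref_rels) / (nodes + rels),
        }
    return ratios


if __name__ == "__main__":
    for name, r in size_ratios(DATASETS).items():
        print(f"{name}: nodes x{r['nodes']:.1f}, "
              f"relationships x{r['relationships']:.1f}, "
              f"elements x{r['elements']:.1f}")
```

By node count the reduced graphs are close to the stated factors (about 9.9 and 5.0), while by relationship count the reduction is larger (about 14.0 and 7.0).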

7 of 16

Queries used

a)

b)

c)

8 of 16

First group of queries measurement results

Figure 1.a

Figure 1.b

9 of 16

Second group of queries measurement results

Figure 2.a

Figure 2.b

10 of 16

Third group of queries measurement results

Figure 3.a

Figure 3.b

Figure 3.c

11 of 16

Third group of queries measurement results (continued)

Figure 3.d

Figure 3.e

12 of 16

Data for usability factor calculation

Criterion                                                   Neo4j   Memgraph   ArangoDB
Years on the market                                         17      8          13
Number of image downloads                                   100M    100K       10M
Number of GitHub users                                      704     321        111
User support                                                1       1          0
Active community                                            1       1          1
Existence of images to deploy in container                  1       1          1
Deployment on different cloud platforms (AWS, GCP, Azure)   3       3          2
Number of supported programming languages                   7       12         5
Usability factor                                            0.92    0.79       0.42

13 of 16

The following equation was used to calculate the usability factor
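As a rough illustration of how a single factor can be derived from the table data, the Python sketch below computes a normalised weighted sum. This is not the equation used in the study: dividing each criterion by its maximum across the three databases and giving all criteria equal weight are assumptions made here, so the resulting values are not expected to match the reported 0.92, 0.79 and 0.42.

```python
from typing import Dict, List, Optional

DATABASES: List[str] = ["Neo4j", "Memgraph", "ArangoDB"]

# Rows of the usability table, in the order Neo4j, Memgraph, ArangoDB.
# Download counts (100M, 100K, 10M) are written out as plain numbers.
CRITERIA: Dict[str, List[float]] = {
    "years_on_market": [17, 8, 13],
    "image_downloads": [100e6, 100e3, 10e6],
    "github_users": [704, 321, 111],
    "user_support": [1, 1, 0],
    "active_community": [1, 1, 1],
    "container_images": [1, 1, 1],
    "cloud_platforms": [3, 3, 2],
    "programming_languages": [7, 12, 5],
}


def usability_scores(criteria: Dict[str, List[float]],
                     weights: Optional[Dict[str, float]] = None
                     ) -> Dict[str, float]:
    """Weighted sum of max-normalised criteria, scaled to the range 0..1."""
    if weights is None:
        # ASSUMPTION: every criterion contributes equally.
        weights = {name: 1.0 for name in criteria}
    total_weight = sum(weights[name] for name in criteria)

    scores = [0.0] * len(DATABASES)
    for name, values in criteria.items():
        best = max(values)
        for i, value in enumerate(values):
            # ASSUMPTION: normalise by the best value among the databases.
            normalised = value / best if best else 0.0
            scores[i] += weights[name] * normalised / total_weight
    return dict(zip(DATABASES, scores))


if __name__ == "__main__":
    for db, score in usability_scores(CRITERIA).items():
        print(f"{db}: {score:.2f}")
```

Under these assumptions the ranking is the same as in the table (Neo4j, then Memgraph, then ArangoDB). Passing a different `weights` dictionary shows how strongly the result depends on the chosen weighting.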

14 of 16

Usability factor

Contribution of each component to the overall usability factor value for Neo4j, Memgraph and ArangoDB databases

15 of 16

Conclusions

Neo4j executed most queries 2-4 times faster than Memgraph. In the first group of queries, however, it was somewhat slower than ArangoDB.

In the second group of queries, Neo4j showed significantly better results than Memgraph, especially for larger graphs.

In the third group of queries, the results of Memgraph and Neo4j are comparable in almost all cases.

From a subjective point of view, Neo4j turned out to be the easiest database to work with. It also has the highest usability factor, 0.92.

16 of 16

THANK YOU FOR YOUR ATTENTION!
