1 of 30

Performance Benchmarking of Node.js, Rust, Go, and Python

in Web Applications

by Igor Khokhriakov aka Ingvord

2 of 30

Disclaimer

In this talk I will only scratch the surface of software system design, i.e. only the part connected with performance benchmarking. The whole topic is huge and probably would not fit even into a 45-minute talk.

If you would like to continue the discussion after this session, please feel free to connect on LinkedIn: https://www.linkedin.com/in/ikhokhryakov/

3 of 30

Preamble

I want to share some knowledge that I think might be useful in the context of the SciCat@DESY project.

We want our project to be:

Secure;
Scalable;
Resilient;
Stable;
Useful…

I want to show you some tools and tricks that may help us achieve these goals.

4 of 30

Preamble

Our goal: Make SciCat useful @DESY

This meeting’s goal:

Share some of my knowledge which I think might be useful in the context of OUR GOAL.

Usefulness has many faces: functional and non-functional

5 of 30

Theory and definitions

Node – physical OR logical unit of a cluster

Instance – a running process of a software component

RPS = Requests per Second.

RPS is one of the key components/metrics when doing “System design”.

6 of 30

Capturing non-functional requirements

Capturing non-functional requirements is required for making informed decisions about budgeting, capacity planning and system design.

Performance benchmarking is required for all of the above, both prior to deployment and at runtime (observability).

Reasoning about performance requires understanding the expected (desired) RPS.

7 of 30

Software System Design

System design is analogous to planning a scientific experiment.

It involves analyzing (and DOCUMENTING!!!)* requirements, outlining the architecture, and detailing component interactions—similar to defining experiment parameters, setting up equipment, and ensuring methods interact effectively.

Key elements include:

Architecture Planning: Structuring major components and their connections.

Component Design: Specifying internal design and implementation.

Testing and Validation: Ensuring the system operates as intended under various conditions.

This is where RPS comes into play.

8 of 30

Why Requests Per Second (RPS)?

Measuring the RPS of your system under different conditions is required for:

Stage 1: Capacity planning - how many software components and how much hardware are required

Stage 2: Liveness metric - if the observed RPS deviates from the designed value, it indicates issues (requires observability tools*)
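
Both stages can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk's benchmarks: the function names, the 30% headroom and the 20% tolerance are assumptions chosen for the example.

```python
import math


def instances_needed(expected_rps: float, max_rps_per_instance: float,
                     headroom: float = 0.3) -> int:
    """Stage 1: how many instances are needed to serve the expected load.

    headroom is an assumed safety margin: each instance is planned to run
    at (1 - headroom) of its measured maximum RPS.
    """
    if max_rps_per_instance <= 0 or not 0 <= headroom < 1:
        raise ValueError("invalid capacity parameters")
    usable_rps = max_rps_per_instance * (1 - headroom)
    return max(1, math.ceil(expected_rps / usable_rps))


def deviates_from_design(observed_rps: float, designed_rps: float,
                         tolerance: float = 0.2) -> bool:
    """Stage 2: flag an issue when observed RPS drifts from the designed value."""
    return abs(observed_rps - designed_rps) > tolerance * designed_rps
```

In practice the observed value would come from an observability tool, and the maximum RPS per instance from a benchmark of the real service.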

9 of 30

Example: System Design of a Social Network

10 of 30

Simplified System Design example

Demonstrated RPS

How to map business metrics to RPS

Estimate load = estimate cost

Model = Session (how many Requests expected)

A < B!!!
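
One possible way to map a business metric to RPS using the session model is sketched below in Python. All names and numbers are illustrative assumptions. The sketch reads A as the estimated load and B as the measured maximum RPS; this is an interpretation, since the slide does not define A.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def expected_rps(sessions_per_day: float, requests_per_session: float,
                 peak_factor: float = 1.0) -> float:
    """Estimate load from the session model: sessions x requests, per second.

    peak_factor is an assumed multiplier for traffic that is burstier than
    the daily average.
    """
    average_rps = sessions_per_day * requests_per_session / SECONDS_PER_DAY
    return average_rps * peak_factor


def within_capacity(load_rps: float, max_rps: float) -> bool:
    """The A < B check: estimated load must stay below measured capacity."""
    return load_rps < max_rps


# Illustrative numbers only: 86_400 sessions/day with 50 requests each
# average out to 50 RPS; an assumed 10x peak factor gives 500 RPS.
a = expected_rps(86_400, 50, peak_factor=10)
b = 5_000  # hypothetical measured maximum RPS
print(a, within_capacity(a, b))
```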

11 of 30

The goal of System Design:

A < B!!!

12 of 30

SciCat example: 1 user action = many requests

13 of 30

Performance Benchmarking

How to measure your Node’s max RPS aka B?!

14 of 30

Two most common scenarios for capturing RPS:

IO load and CPU load

IO (input/output, e.g. networking, writing to disk), for example ingesting data into MongoDB

CPU - intensive computations, for example calculating hashes or string operations such as routing

15 of 30

Let’s explore NodeJs+Express performance

SciCat = NestJs = NodeJs + Express

https://github.com/Ingvord/shiny-guide

16 of 30

Prerequisites

All tests were performed on a typical single-instance virtual machine, armed with 8(4*) CPU cores and 12 GB RAM

Wrk2 was used to simulate requests*

wrk -R{1000..10000} -t10 -c1000 -d30

-R – target rate (requests per second)

-t – number of threads

-c – number of open connections

-d – duration of the test (30 seconds here)

The command above simulates 1_000 concurrent connections issuing requests for 30 s at various target rates

https://github.com/giltene/wrk2
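
A rate sweep like the one above could be scripted roughly as follows. This is a sketch in Python that only builds the command lines with the flags listed above. The target URL, the step of 1000 and the helper names are assumptions, not taken from the talk.

```python
from typing import List


def wrk2_command(rate: int, url: str, threads: int = 10,
                 connections: int = 1000, duration_s: int = 30) -> List[str]:
    """Build one wrk2 invocation with the flags listed above."""
    return [
        "wrk",
        f"-R{rate}",
        f"-t{threads}",
        f"-c{connections}",
        f"-d{duration_s}",
        url,
    ]


def rate_sweep(start: int, stop: int, step: int, url: str) -> List[List[str]]:
    """One command per target rate, from start to stop inclusive."""
    return [wrk2_command(rate, url) for rate in range(start, stop + 1, step)]


# Hypothetical target URL; the step of 1000 is also an assumption.
for cmd in rate_sweep(1000, 10000, 1000, "http://localhost:3000/hello"):
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to review the sweep before putting load on a machine.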

17 of 30

No Magic involved

NodeJs+Express app:

Entry points:

hello – returns a static response
load – simulates work depending on the provided parameters

(cpuTime, sleepTime)
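
The idea behind the two entry points can be illustrated in Python. This is not the code from the repository, which is a NodeJs+Express app. The sketch assumes that cpuTime and sleepTime are given in milliseconds, and the response text of hello is made up.

```python
import time


def simulate_load(cpu_time_ms: float, sleep_time_ms: float) -> float:
    """Burn CPU for cpu_time_ms, then idle for sleep_time_ms.

    Returns the total elapsed time in milliseconds.
    """
    start = time.perf_counter()

    # CPU part: busy-loop until the requested amount of time has passed.
    cpu_deadline = start + cpu_time_ms / 1000
    counter = 0
    while time.perf_counter() < cpu_deadline:
        counter += 1  # throwaway work that keeps the core busy

    # IO part: sleeping stands in for waiting on a database or the network.
    time.sleep(sleep_time_ms / 1000)

    return (time.perf_counter() - start) * 1000


def hello() -> str:
    """Static response, useful as a baseline with no simulated work."""
    return "Hello, World!"
```

Varying the two parameters separately gives a CPU-bound and an IO-bound scenario from the same handler.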

18 of 30

Baseline: 5_000 RPS

19 of 30

NodeJs IO load (single instance*, 8 instances + nginx)

Panels: Single instance; 8 instances + nginx (higher is better)

* instance here and below means a process running on the VM

20 of 30

NodeJs CPU load (single instance, 8 instances + nginx)

Panels: Single instance; 8 instances + nginx (higher is better)

https://github.com/Ingvord/shiny-guide

21 of 30

Bonus baseline (No Express, just plain http): 20_000 RPS

Bonus: Avoiding generic solutions boosts performance

Also makes code smaller, cleaner and much more maintainable :)

22 of 30

Rust CPU load (single instance, 4 instances + nginx)

https://github.com/Ingvord/redesigned-fishstick

Higher is better

23 of 30

Rust IO load (single instance, 4 instances + nginx)

https://github.com/Ingvord/redesigned-fishstick

Higher is better

24 of 30

DevHands.io

Kirill Filimonov

Artem Karpov

25 of 30

Python (FastAPI) – single instance

https://github.com/kirillF/devhands-bootcamp

https://humble-cart-ee9.notion.site/Module-2-Benchmarks-dd5062d1b15149b49b7c553fce92265a?pvs=4

IO - RPS vs Rate (higher is better)

CPU - Latency vs Rate (lower is better)

26 of 30

Go (echo)

https://gitlab.com/devhands/devhands (In Russian)

25 ms CPU + 5 ms IO

1 ms CPU + 19 ms IO

Lower is better

27 of 30

BONUS:

check out this great resource ->

28 of 30

Conclusions

Documenting, documenting and documenting!!!

Requirements, assumptions
Technical/Non-technical requirements
Architecture outline diagram
Components and Connectors diagram
Deployment diagram

A single instance won’t stand a chance against even a moderate load

Multiple instances require orchestration (e.g. 2 instances + nginx per beamline)

Make your software Rusty
