1 of 43

The Rust Invasion

New Possibilities in the Python Ecosystem

Terence Liu @terencezliu

2 of 43


https://fineartamerica.com/featured/british-invasion-encore-ron-magnes.html

3 of 43


4 of 43


5 of 43

cryptography Changelog


6 of 43

Alternatives to alpine

  • musl vs glibc - subtle but important differences.
  • Alternatives: distroless, wolfi
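
The libc difference matters for Python because binary wheels are built per libc: manylinux wheels target glibc, while musllinux wheels target musl. A minimal sketch, using only the standard library, for checking what the running interpreter is linked against. The helper name `describe_libc` is made up for illustration, and `platform.libc_ver()` is a best-effort probe that usually returns empty strings on musl-based images such as Alpine.

```python
import platform


def describe_libc() -> str:
    """Return a short description of the C library behind this interpreter."""
    # libc_ver() inspects the interpreter binary. It reports ("glibc", "<version>")
    # on glibc systems and usually ("", "") when it cannot tell (e.g. on musl).
    name, version = platform.libc_ver()
    if name:
        return f"{name} {version}".strip()
    return "unknown (possibly musl, or a non-Linux platform)"


if __name__ == "__main__":
    print(describe_libc())
```

Running this inside each candidate base image is a quick way to see which wheel family pip or uv would need to find.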


7 of 43

Ruff (2022 - 2023)


8 of 43

Ruff (2022 - 2023)


9 of 43

uv (2024)


10 of 43

uv (2024)


11 of 43

The Buzz


https://x.com/gjbernat/status/1836843228030505053

12 of 43


13 of 43

How Do You Develop With Python?

  • On Mac:
    • System Python - might be too old
    • Homebrew Python - upgraded underneath you too often, breaking existing virtualenvs
  • On Linux:
    • Package manager
  • On Windows:
    • Microsoft Store
  • In general:
    • python.org
    • conda / mamba / micromamba
    • Run in Docker / Docker Compose and mirror outside for editor integration


14 of 43

Deploy Python Apps to the Cloud

  • Dockerfiles
  • Multi-stage builds


15 of 43

uv Showcase (uv venv & uv pip install)


Big project - 250s -> 30s (no cache)

16 of 43

uv Showcase (uv pip list & uv pip tree)


17 of 43

uv Showcase (uv python)


18 of 43

uv Showcase (uv init & uv add & uv sync)


19 of 43


20 of 43

uv Showcase (Installation and Upgrades)

  • Static binaries for fun and profit


21 of 43

Other Static Binary Package Managers

  • Micromamba (C++) - based on conda-forge
    • Like conda / mamba, but without base environment


22 of 43

Other Static Binary Package Managers

  • Pixi (Rust) - based on conda-forge
    • Env per project, instead of globally located
    • Up to 4x faster than micromamba


23 of 43

Pattern Spotted!

  • Per-project envs are good!
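
To make the per-project layout concrete, here is a minimal sketch with the standard-library `venv` module. It only illustrates where the environment lives (a `.venv` directory next to the project files); it is not how uv or Pixi are implemented, and the helper name `create_project_env` is hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create a `.venv` directory inside the project and return its path."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps the sketch fast and avoids depending on ensurepip.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_project_env(Path(tmp))
        # pyvenv.cfg marks the directory as a virtual environment.
        print((env / "pyvenv.cfg").exists())
```

Because the environment sits inside the project directory, deleting or recreating it affects only that project.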


24 of 43

Influence: Cargo - the Rust Package Manager

  • Poetry, uv, and Pixi all took inspiration from Cargo
    • All of them also draw on ideas from earlier package managers
  • Package managers themselves are getting “oxidized”.


25 of 43


26 of 43

Welcome, Fellow Rustaceans…


27 of 43

Let’s Talk about Python…


28 of 43

Why is Python Slow?

  • Interpreted: runs bytecode in a virtual machine rather than compiled machine code
  • Dynamic typing: runtime checking overhead, hard to optimize
  • All objects are heap-allocated (PyObject) and carry a header (reference count, type pointer) even for simple data
  • Garbage collection: extra work tracking objects, memory fragmentation
  • Global Interpreter Lock (GIL): in CPython, the GIL limits parallelism
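
The heap-allocation point can be observed directly. A minimal sketch, assuming 64-bit CPython, that compares a list of boxed `int` objects with a packed `array` of 64-bit integers. The sizes come from `sys.getsizeof` and are only indicative, since CPython caches small integers and shares them between containers.

```python
import sys
from array import array

n = 1_000
boxed = list(range(n))           # list of pointers to separate int objects
packed = array("q", range(n))    # contiguous signed 64-bit integers

# Size of the list's pointer table plus every int object it refers to.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
# Size of the array object, which includes its raw buffer.
packed_bytes = sys.getsizeof(packed)

print(f"one int object: {sys.getsizeof(1)} bytes")
print(f"list of {n} ints: {boxed_bytes} bytes")
print(f"array of {n} int64: {packed_bytes} bytes")
```

The packed layout is what native libraries work on internally, which is one reason they can be both smaller and faster.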


The GIL limitation might change! #nogil (free-threaded CPython, PEP 703)

29 of 43

But Python is the language for Scientific Computing, Data, and AI

  • Python is the “glue” layer for compiled (“native”) libraries. Languages they use besides Python:
    • NumPy - C & Cython
    • SciPy - C, C++, Fortran, Cython & Pythran
    • Scikit-learn - C++ & Cython
    • Pandas - C & Cython
    • lxml - Cython, with libxml2 & libxslt in C
    • Msgpack-Python - C & Cython
    • Psycopg2 - C
    • Cryptography - Previously C / now Rust (PyO3 binding)
    • SpaCy - Cython
    • Pillow - C
    • OpenCV-Python - C++
    • PyTorch - C++ & CUDA
    • Dlib - C++ (pybind11 binding)
    • Faiss - C++ & CUDA (SWIG binding)
    • Hnswlib - C++ (pybind11 binding)


30 of 43

Awesome Rust Projects with Python Bindings

  • orjson - Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
  • ormsgpack - Msgpack serialization/deserialization library for Python, written in Rust using PyO3 and rust-msgpack
  • Polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
  • Tokenizers - Fast State-of-the-Art Tokenizers optimized for Research and Production
  • pydantic-core - Core validation logic for pydantic, written in Rust
  • py-spy - Sampling profiler for Python programs
  • DataFusion - Apache DataFusion SQL query engine, with Python bindings
  • Delta Lake - A native Rust library for Delta Lake, with bindings into Python


31 of 43

What to Use for Faster Code?

  • A few sane choices
    • Numba - Just-in-Time (JIT) compilation for simple / numerical routines
    • Cython for C code
    • pybind11 for C++ code
    • PyO3 for Rust code


32 of 43

Rust as the “Glue” for C/C++

  • Just like Python is the “glue” for compiled libraries, Rust can be the “glue” for C/C++ libraries.
    • They can all speak the C ABI through a Foreign Function Interface (FFI)
    • Drastically broadens the world of software options
    • Encourages memory safety of C/C++ projects through osmosis at interface level
    • e.g.:
      • libc - Raw FFI bindings to platforms' system libraries
      • magic - High level bindings for the `libmagic` C library
      • libsql - libSQL library: the main gateway for interacting with the database
      • rocksdb - Rust wrapper for Facebook's RocksDB embeddable database
      • tch-rs - Rust wrappers for the PyTorch C++ api (libtorch).
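
The shared C interface is the same boundary Python reaches through `ctypes`, so it can be illustrated from the Python side. A minimal sketch that calls libc's `strlen`, assuming a POSIX system where `ctypes.CDLL(None)` exposes the symbols already loaded into the process. The wrapper name `c_strlen` is made up for illustration.

```python
import ctypes

# POSIX-only assumption: passing None opens the current process, whose
# already-loaded symbols include the C library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Measure the UTF-8 encoded length of text using libc's strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("rust"))
```

A Rust crate binding a C library declares the same kind of signature on its side of the boundary, then wraps it in a safe API.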


33 of 43

Side Note: Memory Safety Pressuring the C/C++ Community


https://twitter.com/seanbax/status/1839030830968012989

34 of 43

Rust/Python Interop Game Plan

  • Move performance-critical work one layer lower from Python to Rust, and expose coarse-grained operations to Python
    • Many small operations - call into Rust once and do all of them there
    • Further: leave threading to Rust, no Python threads
    • Further: use async Rust for big-batch network IO and wrap in sync call in Python
  • Code once in Rust, profit also in other languages through bindings
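
The "call once, do all the work there" idea can be sketched without any Rust. Here the C-implemented `struct` module stands in for a native extension (an assumption made purely for illustration): one version crosses the Python/native boundary once per item, the other crosses it once for the whole batch. Both function names are hypothetical.

```python
import struct
import timeit

values = list(range(10_000))


def pack_one_by_one(items):
    """Cross the Python/native boundary once per item."""
    return b"".join(struct.pack("<q", item) for item in items)


def pack_in_one_call(items):
    """Cross the boundary once and let native code loop over the items."""
    return struct.pack(f"<{len(items)}q", *items)


if __name__ == "__main__":
    # Both strategies must produce identical bytes.
    assert pack_one_by_one(values) == pack_in_one_call(values)
    for fn in (pack_one_by_one, pack_in_one_call):
        seconds = timeit.timeit(lambda: fn(values), number=50)
        print(f"{fn.__name__}: {seconds:.3f}s for 50 runs")
```

The batched version typically wins because the per-call overhead is paid once. A PyO3 function that accepts a whole list or buffer applies the same principle.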


35 of 43

Accelerating Msgpack Data Pipeline - Setup


36 of 43

Accelerating Msgpack Data Pipeline - Python Impl


37 of 43

Accelerating Msgpack Data Pipeline - Rust Impl


. . .

38 of 43


. . .

39 of 43

Accelerating Msgpack Data Pipeline - Rust Impl


40 of 43

Speed-Up


Charts: 1 thread vs. 4 threads

41 of 43

If Reading from File and Deserializing in Rust…


Charts: 1 thread vs. 4 threads

42 of 43

Awesome-Rust - Rich and Thriving Ecosystem


43 of 43

Thank you!
