Is this memory safety here in the room with us?
Halvar Flake / Thomas Dullien
DistrictCon 0 2025
Why memory safety?
The 40,000-foot view.
Why do people write software?
What I need
What I have
Abstract view
Our software is an “intended” FSM emulated on a real-world CPU - the CPU has many more states, but our intent is to restrict it to those that “make sense” as FSM states.
An unintended state is entered
An event triggers a transition into a state that is “nonsensical” or “unintended” when viewed through the FSM lens.
Trying to transition as if it were a sane state
Further events make the software attempt to transition to the next FSM state (see red arrow), but the state is “broken”.
The weird machine
Transforming a broken state leads to a new broken state.
The weird machine
Attackers can continue driving the machine into new states, possibly reaching “all states” (or at least many that violate expected security properties).
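A toy sketch of this (my illustration, not the talk's): a 4-state FSM emulated in an 8-bit state variable. Inside the sane set, no event sequence escapes; once the state is corrupted, the attacker picks events to steer to any of the 256 machine states.

```rust
// Intended FSM: 4 states, emulated in a u8 with 256 machine states.
fn step(state: u8, event: u8) -> u8 {
    if state < 4 {
        (state + event % 4) % 4 // intended transitions stay in 0..=3
    } else {
        // "Trying to transition as if it were a sane state":
        state.wrapping_add(event)
    }
}

fn main() {
    // Inside the intended FSM, no sequence of events leaves states 0..=3.
    let mut s: u8 = 1;
    for e in [3, 1, 2, 0] {
        s = step(s, e);
        assert!(s < 4);
    }

    // One corruption puts the state outside the sane set...
    s = 200;
    // ...after which the attacker chooses events to reach ANY state:
    let target: u8 = 0x41;
    let event = target.wrapping_sub(s);
    assert_eq!(step(s, event), target);
    println!("corrupted state steered to {:#x}", target);
}
```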
Nested state spaces in computing
Possible physical states of the computational device
Observable states of the computational device
Documented possible states of the computational device
“Sane” states of the computational device running the software
Program execution should follow trajectories through “intended”, “sane” states
During exploitation, a state outside the “intended”, “sane” set of states is reached
The attacker carefully controls the trajectory through those “weird” states
Memory safety attempts to put an extra “wall” into this diagram.
“Memory safe” states.
Fewer → more states of the machine reachable:
Small FSM → RegExp → Java/Go → Safe Rust → C/C++ → Unsafe Rust → Assembly
What does memory safety provide?
Corrupting a pointer or an array index, or writing through a pointer after its memory has been released, throws nearly all statements about the state of the machine out of the window.
Corrupt memory tends to let the unicorns escape
”Here be unicorns”
Why is memory corruption special?
What does memory safety provide?
If memory safety is maintained, the abstract machine that the language defines stays intact in the presence of most other bugs.
A link between the language syntax (and the source code) and behavior of the machine is maintained.
A horse stays a horse and does not grow wings and a horn.
Example: Graph of variables assignments
typeof(LHS) ← typeof(RHS)
How is memory safety usually achieved?
Memory safety is commonly viewed as two components
In theory, you could prove for a given C/C++ program that it satisfies these properties. In practice for most codebases, this isn’t done, so languages (or hardware) are modified to have safety mechanisms.
These safety mechanisms can be implemented in runtime, during compile-time, or a combination of both.
Application logic can still become arbitrarily confused, but the goal is to prevent such confusion from ever allowing the dereference of a corrupt pointer.
Obtaining spatial safety
Spatial safety is usually (not always) obtained through the following steps:
These safety mechanisms are implemented in a combination of runtime and compile-time.
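A minimal illustration of the runtime half, using Rust's compiler-inserted bounds checks as the example mechanism:

```rust
use std::panic;

// Spatial safety in practice: every slice index in safe Rust carries
// a bounds check inserted by the compiler.
fn read_byte(buf: &[u8], idx: usize) -> u8 {
    buf[idx] // compiled with an implicit `idx < buf.len()` check
}

fn main() {
    let buf = vec![0u8; 16];
    assert_eq!(read_byte(&buf, 15), 0); // in bounds: plain load
    // Out of bounds: the check fires and panics instead of reading
    // whatever happens to live after the allocation.
    let oob = panic::catch_unwind(|| read_byte(&vec![0u8; 16], 16));
    assert!(oob.is_err());
    println!("out-of-bounds read trapped");
}
```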
Obtaining temporal safety
There are different approaches for obtaining temporal safety. The common ones are:
All of these approaches require the coordination between the compiler/interpreter and the runtime.
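One such approach, sketched in Rust (illustrative, not a prescription): refcounted ownership plus weak handles makes a stale reference fail observably instead of dangling.

```rust
use std::rc::{Rc, Weak};

// Temporal safety via ownership: the allocation dies with its last
// strong owner, and stale handles fail observably instead of dangling.
fn escape_a_handle() -> Weak<String> {
    let owner = Rc::new(String::from("session data"));
    Rc::downgrade(&owner)
    // `owner`, the only strong reference, is dropped here; memory freed.
}

fn main() {
    let stale = escape_a_handle();
    // The would-be use-after-free surfaces as `None`, not a dangling read:
    assert!(stale.upgrade().is_none());

    let owner = Rc::new(String::from("still alive"));
    let live = Rc::downgrade(&owner);
    assert!(live.upgrade().is_some()); // owner alive: access permitted
}
```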
The different flavors of memory safety
Flavor 1: Whole-program analysis
Flavors of memory safety: (1) Proving absence
ASTREE and Airbus Avionics
Benefits of this approach:
Downsides
Flavor 2: Garbage Collection and runtime array checking
GC and runtime bounds checking
Garbage collection: Java, C#, Go, Python etc.
The most important wall of our time
The memory wall. Why linked lists suck. Cache rules everything around me.
Cost of garbage collection
Hertz/Berger 2005: GC heap size vs. perf tradeoff
Cycle equivalence at 5x RAM consumption.
More than 50% more cycles at 2x RAM consumption
GC everywhere: Do I pay more DRAM or more cycles?
Napkin math:
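A hedged sketch of such napkin math, plugging in the Hertz/Berger trade-off above. The unit prices are made up for illustration; the talk's actual numbers are not reproduced here.

```rust
// ASSUMED unit prices, purely illustrative; real numbers depend on your fleet.
const DRAM_DOLLARS_PER_GB: f64 = 3.0; // assumption
const DOLLARS_PER_CORE: f64 = 50.0;   // assumption

// Hertz/Berger option A: ~5x heap for the GC, at cycle parity.
fn extra_cost_5x_ram(heap_gb: f64) -> f64 {
    (5.0 - 1.0) * heap_gb * DRAM_DOLLARS_PER_GB
}

// Option B: ~2x heap, but >50% more cycles (modeled as 50% more cores).
fn extra_cost_2x_ram(heap_gb: f64, cores: f64) -> f64 {
    (2.0 - 1.0) * heap_gb * DRAM_DOLLARS_PER_GB + 0.5 * cores * DOLLARS_PER_CORE
}

fn main() {
    let (heap_gb, cores) = (8.0, 4.0);
    println!("extra cost at 5x RAM:          ${:.0}", extra_cost_5x_ram(heap_gb));
    println!("extra cost at 2x RAM + cycles: ${:.0}", extra_cost_2x_ram(heap_gb, cores));
}
```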
Quick note on language design and GC: Go vs. Java
Flavor 3: Reference counting and runtime array checking
Reference counting
Pro/Cons of reference counting
Pro: Compact heap.
Con: Synchronization performance hit.
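Both points in one small sketch (illustrative, using Rust's Arc): every clone/drop is an atomic refcount update, which is the synchronization hit; the payoff is a compact heap freed deterministically at count zero.

```rust
use std::sync::Arc;
use std::thread;

// Fan a refcounted buffer out to reader threads and sum what they see.
fn share_across_threads(data: Vec<i32>, readers: usize) -> usize {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let copy = Arc::clone(&shared); // atomic increment
            thread::spawn(move || copy.len()) // atomic decrement when dropped
        })
        .collect();
    let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    // Back to a single owner; the allocation is freed exactly when the
    // last Arc goes away - no heap bloat, no GC pause.
    assert_eq!(Arc::strong_count(&shared), 1);
    total
}

fn main() {
    assert_eq!(share_across_threads(vec![1, 2, 3], 4), 12);
}
```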
Flavor 4: Strict ownership semantics, lifetimes, and runtime array checking (Rust)
Rust’s big contribution
“Rewrite it in Rust”
Industry traction
Pro/Cons of strict ownership semantics
Pros:
Con:
Rust forces specific architectural choices on the programmer. These are often, but not always, the right choices for the task.
(Flavor 5: C++ safety profiles and 21st century C++)
Pro/Cons of 21st century C++
Pro: Backward compatibility, incremental porting.
Con: Only exists on paper.
Current hardware approaches: MT and CHERI
Hardware approaches: MT
Historically, most memory safety approaches were software-only.
Over the last few years, memory tagging has entered the discussion (and even implementation); it allows a limited, probabilistic form of memory safety to be hardware-enforced.
MT modifies malloc to “tag” memory (using special instructions) with a few tag bits. These bits are also stored in the upper bits of 64-bit pointers that the architecture ignores (usually bits 57 through 63).
On memory dereference, these bits are compared (by the hardware) to the tag, and an exception is raised when they don’t match.
Relatively easily retrofitted to existing systems (but DRAM cost!)
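A toy software model of the tag check (the real mechanism lives in hardware and the allocator; names and constants here are illustrative):

```rust
const TAG_SHIFT: u32 = 57; // tag mirrored into otherwise-ignored pointer bits

// malloc stores a 4-bit tag for the granule and mirrors it into the pointer.
fn tag_pointer(addr: u64, tag: u8) -> u64 {
    addr | ((tag as u64 & 0xf) << TAG_SHIFT)
}

// On dereference, hardware compares the pointer's tag bits against the
// tag stored for the memory granule and faults on mismatch.
fn check(ptr: u64, memory_tag: u8) -> Result<u64, &'static str> {
    let ptr_tag = ((ptr >> TAG_SHIFT) & 0xf) as u8;
    if ptr_tag == memory_tag {
        Ok(ptr & !(0xfu64 << TAG_SHIFT)) // strip tag, perform the access
    } else {
        Err("tag check fault")
    }
}

fn main() {
    let p = tag_pointer(0x1000, 0xA); // allocation tagged 0xA
    assert!(check(p, 0xA).is_ok()); // matching tag: access proceeds
    // After free+retag, a stale pointer faults with probability 15/16:
    // only a few tag bits, hence "probabilistic" safety.
    assert!(check(p, 0x3).is_err());
}
```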
Hardware approaches: CHERI
Custom CPU cores (historically MIPS, now RISC-V) with capabilities.
Fat pointers with bounds and permissions encoded.
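A toy model of such a fat pointer (illustrative only; real CHERI compresses bounds and also encodes permissions and provenance):

```rust
// A "capability": a pointer that carries its own bounds, checked on use.
#[derive(Clone, Copy)]
pub struct Cap {
    pub base: usize, // lower bound into "memory"
    pub len: usize,  // length of the region this capability may touch
}

// Every load goes through the bounds check; on real hardware an
// out-of-bounds access raises a capability fault.
pub fn cap_load(mem: &[u8], cap: Cap, idx: usize) -> Result<u8, &'static str> {
    if idx >= cap.len {
        return Err("capability bounds fault");
    }
    Ok(mem[cap.base + idx])
}

fn main() {
    let mem = [1u8, 2, 3, 4, 5, 6, 7, 8];
    let cap = Cap { base: 2, len: 3 }; // may only see mem[2..5]
    assert_eq!(cap_load(&mem, cap, 0), Ok(3));
    assert!(cap_load(&mem, cap, 3).is_err()); // one past the bound: fault
}
```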
Honorable mention: MiraclePtr
Retrofitting UAF safety into Chrome by adding reference counting - more a mitigation than memory safety. Doesn’t help against iterator invalidation etc.
wipe sweat off brow
Observations
Local reasoning vs. global problems
Using a more powerful type system to turn global problems into locally checkable problems seems to work.
Local reasoning vs. global problems
If you squint, you are proving local properties on each function, and then composing local proofs into a whole-program proof of safety.
Copious annotations (in the form of types) are needed to make the proofs work.
As the type-checker (theorem prover) becomes more powerful, fewer annotations are needed (lifetime elision).
In the limit, the type system approach and the program analysis approach converge, from different sides.
Rust is already adding a Prolog-style theorem prover (Chalk) to the compiler to deal with implications in the type system.
TANSTAAFL
Where does this leave us?
Building a memory-safe userspace network application (an SMTP server, for example) is a solved problem.
We can write memory safe userspace services
Great, we are safe then?
What is not (yet?) covered by existing mechanisms?
Writing safe unsafe Rust is not easy
Shared memory TOCTOU
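A sketch of the double-fetch variant of this bug class (names and layout are illustrative; the peer's write is simulated inline to keep it deterministic):

```rust
// A length field in shared memory is validated once, then re-read
// after the untrusted peer has changed it.
fn read_message(shared_len: &mut usize, buf: &[u8]) -> Option<Vec<u8>> {
    // Time of check:
    if *shared_len > buf.len() {
        return None;
    }
    // The peer races us here and rewrites the field:
    *shared_len = 64;
    // Time of use: re-fetching the value makes the earlier check worthless.
    // Safe Rust still bounds-checks the slice; through raw pointers or an
    // FFI boundary, this would be an out-of-bounds read.
    buf.get(..*shared_len).map(|s| s.to_vec())
}

fn main() {
    let mut shared_len = 4usize; // the "shared memory" field
    let buf = [0u8; 8];
    assert!(read_message(&mut shared_len, &buf).is_none());
}
```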
Surprising callbacks out of the type system
Classical C++ browser bug pattern:
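One well-known instance of the pattern, sketched in Rust for demonstrability (not actual browser code): a callback mutates the collection being iterated. C++ would keep using the invalidated iterator; here the aliasing is detected at runtime.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

// While iterating a listener list, a scripted "callback" appends to the
// same list, invalidating the iteration in flight.
fn mutate_while_iterating() -> bool {
    let listeners = RefCell::new(vec!["a", "b"]);
    catch_unwind(AssertUnwindSafe(|| {
        for _listener in listeners.borrow().iter() {
            // the callback fires mid-iteration and mutates the list:
            listeners.borrow_mut().push("c"); // already borrowed: panic
        }
    }))
    .is_err()
}

fn main() {
    assert!(mutate_while_iterating()); // the invalidation was caught
}
```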
If you rely on your type system to provide memory safety, the invariants of the type system need to be kept intact by any other language you call into.
This is conceptually a variant of shared-memory TOCTOU bugs.
Issues around FFI (and GC and type systems)
Subtleties about FFIs and memory-safe languages can fill a book.
Dynamic linking in Rust was historically a nightmare:
Array indices as proto pointers
Both have their advantages and disadvantages.
Array indices as proto pointers
The pool-of-nodes approach raises a philosophical question:
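A sketch of why the question arises (my illustration): a recycled pool slot makes a stale index resolve to the wrong object. The program is memory-safe by construction, yet this is a use-after-free at the application level.

```rust
// Pool of nodes: "pointers" are indices into a Vec, with a free list.
pub struct Arena {
    pub nodes: Vec<String>,
    free: Vec<usize>,
}

impl Arena {
    pub fn new() -> Arena {
        Arena { nodes: Vec::new(), free: Vec::new() }
    }
    pub fn alloc(&mut self, v: &str) -> usize {
        if let Some(i) = self.free.pop() {
            self.nodes[i] = v.to_string(); // recycle a released slot
            i
        } else {
            self.nodes.push(v.to_string());
            self.nodes.len() - 1
        }
    }
    pub fn release(&mut self, i: usize) {
        self.free.push(i);
    }
}

fn main() {
    let mut arena = Arena::new();
    let a = arena.alloc("alice");
    arena.release(a);           // slot recycled...
    let b = arena.alloc("bob"); // ...and handed out again
    assert_eq!(a, b);
    // The stale handle `a` still "works", but names the wrong object:
    assert_eq!(arena.nodes[a], "bob");
}
```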
JIT miscompiles
The majority of exploited browser bugs in recent years were not issues of memory safety.
Hardware errata
If the hardware misbehaves, all bets are obviously off.
GPU and xPU interactions
What’s next?
The role of LLMs and AI in the process
The role of AI
Research & Engineering ahead
Research & Engineering topics
Research & Engineering topics
With memory corruption, anything is possible
Hitler was a leftist