V8 Sandbox - External Pointer Sandboxing

Author: saelo@

First Published: July 2022

Last Updated: July 2022

Status: Living Doc

Visibility: PUBLIC

This document is part of the V8 Sandbox Project and discusses the design of the external pointer tables used to allow for memory-safe access to objects outside the V8 sandbox. For a general overview of the sandbox design see the high-level design document.

Background

The goal of the V8 sandbox is to prevent an attacker with the ability to corrupt V8 (heap) memory from corrupting memory outside of the sandbox. This is achieved, in effect, by removing all raw pointers from the V8 heaps and turning them either into an offset from the start of the sandbox address space (a “sandboxed pointer”) or an index into a table of pointers, the “external pointer table” (in the following abbreviated as “EPT”), located outside of the sandbox. This mechanism is similar in principle to the Unix kernel file descriptor table.

As an example, consider the following JavaScript code snippet:

let elem = document.createElement("p");

elem.textContent = "foobar";

The first line will allocate two different objects: a C++ HTMLParagraphElement in Blink and a V8HTMLParagraphElement on the V8 heap which acts as a wrapper around the C++ object and makes it accessible to JavaScript code. When the second line of code is executed, V8 will load the pointer to the wrapped HTMLParagraphElement from the V8 wrapper object and call the C++ setTextContent method on it.

The following shows a simplified in-memory view of the V8 wrapper object without sandboxing:

(lldb) x/4wx 0x3771ac5904b0

0x3771ac5904b0: 0x08a4bb09 0x080406e1 0x080406e1 0x02c144b0

0x3771ac5904c0: 0x00000138

Here, the first three fields are compressed pointers to the Map object and the Properties and Elements buffer respectively. The fourth and fifth field (green) together contain a raw pointer (0x0000013802c144b0) to the wrapped C++ HTMLParagraphElement object. As the V8 HeapObject can be corrupted by an attacker, this raw pointer can be abused to corrupt memory outside the sandbox and thereby escape from it. To prevent this, the external pointer must be sandboxed as shown conceptually in the following in-memory view:

(lldb) x/4wx 0x3771ac5904b0

0x3771ac5904b0: 0x08a4bb09 0x080406e1 0x080406e1 0x00000123

Here the fourth field now instead contains an index (0x123) into an EPT in which the actual pointer to the HTMLParagraphElement is stored. This index is also called an external pointer handle. As the table is located outside the sandbox, an attacker can only corrupt the handles, but not the actual pointers. Memory safety is then achieved if

  1. the table cannot be accessed out-of-bounds through a corrupted handle
  2. all entries in the table are either valid pointers to live objects or values that are guaranteed to crash when dereferenced as a pointer
  3. there is a type check whenever a pointer is loaded from an EPT to ensure that the external object has the expected type
  4. a thread cannot gain access to external objects owned by another thread (unless they are explicitly shared)
  5. external objects of the same type can always safely be substituted for each other

The rest of this document discusses how each of these points is achieved with minimal performance overhead.

Design

The proposed EPT design is based on fixed-size, garbage collected, per-Isolate tables using shifted indices as handles. Entries use inline marking bits and inline, AND-based type tags for type checking. Each of these features is discussed next.

Per-Isolate Tables

All V8 Isolates in the process share the same Sandbox, and so multiple threads can execute V8 code at the same time and operate on the same HeapObjects. As it cannot generally be assumed that external objects can safely be accessed from multiple threads, they should by default only be accessible to the thread that “owns” them (i.e. that created the EPT entry for them). This is achieved by having a separate EPT for each Isolate. Then, when multiple threads are executing V8 code at the same time, they will have different “active” Isolates, and therefore use different EPTs when resolving external pointer handles. External objects that should be accessible to multiple threads can still be shared explicitly, either by creating entries for them in multiple EPTs or by having an entry for them in the shared Isolate’s EPT.

Shifted Indices as Handles

As the handles referencing entries in an EPT are stored on the heap where they can be corrupted by an attacker, it is necessary to ensure that every access to the table does not go out of bounds. Instead of performing a bounds check on every access, a fixed-size (e.g. 128MB) region of virtual memory is reserved for the table during initialization. The table is then never reallocated, but only grown/shrunk in place.

The handles are simply shifted indices into the table, which guarantees that the index, after unshifting, is always less than the number of entries in the table. Returning to the example given at the beginning of the document, assuming a 128MB table (and therefore 16M 8-byte entries) the index 0x123 would be stored as the 32-bit unsigned integer 0x00012300. On access, the handle would be shifted to the right by 8, thus always guaranteeing a value less than 16M. If an unused entry is accessed, the access will either crash due to the memory being inaccessible (PROT_NONE) or cause a nullptr dereference as the data is zero, which is assumed to be safe.

The fact that the table backing memory is never reallocated also simplifies the (concurrent) GC implementation, discussed next.

Garbage Collected Entries with Inline Marking Bits

When an EPT entry is no longer used, it needs to be freed so it can be reused again later. In general, there is no mechanism in V8 to execute a finalizer function (which could free any EPT entries) whenever a HeapObject is collected by the GC. As such, the table itself needs to be garbage collected. Garbage collected entries also allow sharing EPT entries between different HeapObjects if that is desired. This GC model also dictates a 1:1 or 1:N relationship between heaps and EPTs (in practice between Isolates and EPTs). There cannot be an EPT associated with more than one heap as it could not easily be garbage collected (each heap has its own GC).

The EPT GC uses a simple mark-and-sweep algorithm and requires one bit per entry to indicate whether an entry is alive (marked) or not. This marking bit is set during the marking phase of a full GC when an object with external pointers is visited. During the sweeping phase of the GC, the entire table is swept, building up a freelist of all free entries and resetting the marking bits.

While the marking bits could be stored out-of-line, in a separate buffer, the current design places the marking bits in the MSB of the pointers stored in the EPT entries. This means that:

  1. The marking bit must be removed when a pointer is loaded from an EPT
  2. The marking bit must be preserved or simply set to 1 when a pointer is stored in an EPT

Both operations are piggy-backed onto the type checking mechanism, discussed in the next section, and therefore effectively free. In particular, (1) happens as part of the bitwise AND performed for the type check while (2) happens automatically when a pointer is tagged with its type tag before storing it in an EPT.

Since the marking phase of the GC may run on a background thread, the EPT must be thread-safe. Generally, most EPT operations are simply atomic loads or stores of pointer-sized values. However, marking and allocating entries require special care:

  • When an entry is marked as alive, the old value is loaded, the MSB set, and the pointer written back. This could race with a store happening concurrently. To prevent this race, the marking must happen through CAS. If the CAS succeeds, then the value was atomically swapped. On the other hand, if it fails, then the mutator thread will have written a new pointer into the entry which is guaranteed to be marked, so no further attempts at setting the marking bit are necessary.
  • When an entry is allocated, the top entry from the freelist is returned and the 2nd entry on the freelist becomes the head. This can race with another entry allocation happening concurrently (e.g. from another background thread). This can also be solved with a CAS. However, a lock is required for the slow path when the freelist is empty and the table needs to be grown in place.

The sweeping phase of the GC should always run on the main thread. This avoids potential UAF issues where an external object is resolved to a raw pointer on one thread, but then freed through a table GC on another thread. This, however, also requires that external objects only be accessed on the main thread.

Finally, it is important to note that in order to guarantee that every entry in an EPT always points to a valid object, it is required that the entries be invalidated the moment the external object is deallocated. This requirement suggests that the EPTs should be the source of truth for determining when an external object can be deleted (because it has no more references from V8). While it is possible to determine this independently, it risks causing a situation where an external object is freed while its entry in an EPT is still valid, and can therefore be abused by an attacker to cause a use-after-free.

AND-based Type Checks with Inline Type Tags

One of the preconditions for achieving memory safety is that every access to an external pointer must perform a type check to protect against type confusion attacks. Without such a check, in the example given above an attacker would be able to swap the external pointer handle (0x00012300) with another valid handle pointing to an object of a different type. Using the external pointer would then result in a type confusion outside the sandbox (in the above example, in Blink), and therefore allow an attacker to break out of the sandbox.

The basic idea of AND-based type checking is to encode types as bitsets, where every bitset has the same number of bits set. It is then enough to perform a single bitwise AND with the inverted type as bitmask to check for the expected type. As an example, consider the case of four available type bits. This allows encoding six different types:

0011

0101

0110

1001

1010

1100

To verify that an object has the expected type, the pointer is AND-ed with the inverted type tag bits. For example, for the first tag (0011), this would mean ANDing with 1100. It is then guaranteed that for every other type, the result of that AND would be a nonzero value, and as such result in a non-canonical pointer. This in turn will lead to a crash when the pointer is accessed.

With N available bits, this construction allows for a maximum of  (“N choose N/2”) different types. With 16 reserved top bits, and one bit used for the GC marking bit, it would allow for  different types. If only 8 bits are available (e.g. due to TBI, see below), this construction allows for  different types.

AND-based type checks are particularly efficient in this case, as they allow type checking and the removal of the marking bit in a single, cheap operation.

It should be noted that this construction assumes that setting the top bits of a pointer to a nonzero value causes the pointer to become invalid. It therefore requires special care when interoperating with technologies like 5-level paging on x64 or top-byte ignore (TBI) on Arm64. In particular, TBI, which appears to be enabled by default on recent OSes, causes the bits [56, 63] to be unusable for AND-based checks as they will be ignored when the pointer is used (and thereby allow a type-check failure to go undetected). 5-level paging on the other hand may cause bits [48, 55] to become unusable should the hosting application opt-in to the larger virtual address space. Together these two technologies would prevent simple inline type tags from working at all. As such, there should be room for future adjustments to the scheme to be able to deal with more different types and less available type-tag bits. Broadly, the following options currently exist:

  1. Storing type information in an additional EPT entry, either preceding or following the pointer entry (in that case the type information must itself “look like” a pointer and be tagged). This would require “fat” table entries, consisting of two pointer-sized fields.
  2. Use additional pointer bits, for example bits [40-47] to store type information. This is only possible if the Embedder guarantees that all external objects are allocated in a subregion of the virtual address space so that the external objects can be referenced through offsets instead of pointers.
  3. Allocating additional tables and selecting the table based on (some part of) the type tag.
  4. Use Pointer Authentication (PAC) on Arm64 to implement the type checking mechanism instead of AND-based type tags. This will, however, affect the implementation of the garbage collection as using an inline marking bit would no longer be easily possible.

All options should be feasible for the Chromium use case and will be explored further in the future. The concrete type checking implementation could then become a compile-time option.

Entry Ownership

There are two broad options for the ownership of table entries: shared entries and owned entries. In the first model, table entries can be shared between multiple HeapObjects and are allocated whenever an external pointer field of a HeapObject is updated (the entries must generally be read-only when they are shared). In the second mode, a HeapObject that can store an external pointer allocates the EPT entries when it is created and keeps them throughout its lifetime.

Both approaches have pros and cons. A drawback of shared entries is that it would cause writes to external pointer fields to become unpredictable both from a performance and a security point of view as they might require garbage collection to run (if there are no free entries in the external pointer table). This might cause janky performance, but could also lead to security issues if it is not modeled correctly by e.g. the JIT compiler. On the other hand, if many external pointer fields are never written to (essentially nullptr), then allocating entries for them would be wasteful.

The proposed design supports both ownership modes, but in practice owned entries are used to avoid the issues with shared entries (in particular, unexpected garbage collection). However, there is a reserved “null entry” (at index zero) which always contains nullptr. This entry can be used in cases where it is more efficient to allocate EPT entries lazily.

Summary: How Memory Safety is Achieved

This section briefly summarizes how the proposed design fulfills the four requirements for memory safety listed earlier.

1. No OOB access into an EPT

Out-of-bounds access into the table is prevented by reserving a fixed-size (e.g. 128MB) virtual memory block for every table during initialization and using left-shifted indices (by 8 for a 128MB table) to guarantee that the index (after unshifting) will always be less than the number of entries in the table.

2. EPT Entries are always “Valid”

As any entries in an EPT can be referenced by an attacker, every entry must either be a pointer to a live object or a value that is guaranteed to crash if dereferenced. This is achieved by garbage collecting the table and ensuring that any table entry is invalidated once the pointed-to object is deallocated.

3. External Pointer Type Checking

As handles can be arbitrarily manipulated by an attacker, every access to an external pointer must perform a type check. This is achieved through the inline, AND-based type checks where the unique type tag is encoded into the unused top bits of each pointer stored in an EPT. A failed type check guarantees that the resulting pointer will be invalid and cause a crash when dereferenced.

4. Thread-safe access to External Objects

Threads cannot gain access to external objects owned by other threads as the pointers are kept on another Isolate’s EPT. External pointer handles thus only have meaning in combination with an active Isolate. This ensures safe access to external objects that are not by themselves thread-safe.

The EPT itself must be thread-safe as it may be accessed from background threads. As such, every access (loading, storing, swapping, …) of an entry must be performed atomically.

5. External Object Substitutability

The above properties only ensure that an access into an EPT yields a valid pointer to a live object of the expected type. It is still required that such objects be substitutable for each other (an attacker can swap handles to external objects of the same type arbitrarily!). This property ultimately has to be ensured by the application. In practice, this means that for any group of related external objects, there should generally only be a single reference that “crosses” the sandbox boundary so that these relationships cannot be broken by an attacker.

As a final note, special care must be taken when operating on external objects after resolving them to a “real” pointer through the EPT. In particular, any data related to an external object that is stored inside the sandbox must be validated before being used. For example, if indices into an external array are kept inside the sandbox, they must be validated (e.g. through a CHECK) before being used to access the array.

Example

An example for an external pointer table is shown below. Here, the following type tags are used:

kFreeEntryTag         = 0x7f80000000000000 (tag 0b0111111110000000)

kExternalObj1Tag      = 0x807f000000000000 (tag 0b1000000001111111)

kExternalObj2Tag      = 0x80bf000000000000 (tag 0b1000000010111111)

kExternalObj3Tag      = 0x80df000000000000 (tag 0b1000000011011111)

Note that each tag except for the free entry tag has the mark bit (the MSB) set so that every store of an external pointer into an EPT automatically marks the entry as alive.

I

m ttttttttttttttt ppp ... ppp (description)

0

0 000000000000000 000 ... 000 (reserved null entry)

1

0 111111110000000 000 ... 101 (index of next free entry: 5)

2

1 000000001111111 xxx ... xxx (pointer to an ExternalObj1)

3

1 000000010111111 xxx ... xxx (pointer to an ExternalObj2)

4

0 000000011011111 xxx ... xxx (pointer to an ExternalObj3)

5

0 111111110000000 000 ... 111 (index of next free entry: 7)

6

1 000000011011111 xxx ... xxx (pointer to an ExternalObj3)

I = index

m = marking bit

t = type tag bit

p = payload bit

In this example, the table contains:

  • The empty null entry at index 0
  • A free entry at index 1
  • A (marked) entry to an ExternalObj1 at index 2
  • A (marked) entry to an ExternalObj2 at index 3
  • A (unmarked) entry to an ExternalObj3 at index 4
  • Another free entry at index 5
  • A (marked) entry to an ExternalObj3 at index 6

When an external object of type ExternalObj2 is now accessed, V8 will load the external pointer handle (e.g. 0x00000300 for the 4th entry) from the HeapObject, then load the 64 bit value from the table at the index specified by the handle. It will then AND the bits with the mask 0x7f40ffffffffffff (the inverse of kExternalObj2Tag) to remove the mark bit and type tag but leave all other bits in place. If the entry in the table is not a pointer to an ExternalObj2, this operation is guaranteed to cause other type bits to be one, thus rendering the pointer inaccessible.

Similarly, when V8 stores a pointer to a ExternalObj2 into a HeapObject, it will first OR the pointer with 0x80bf000000000000 to set the mark bit and the type tag at the same time, then write the value into the active EPT and write the handle to that entry into the HeapObject.