1 of 10

Hostile Multi-Tenancy on Commodity GPUs: Can It Be Secure?

Demi Marie Obenour

Invisible Things Lab

2 of 10

Current GPU virtualization options

  • Large attack surface (GVT-g, VirGL/VirtIO)
  • Only one VM at a time (PCIe pass-through without SR-IOV)
  • Requires high end hardware (PCIe pass-through with SR-IOV)
  • Proprietary (not suitable for FOSS, including Qubes OS)
  • Some combination of the above

3 of 10

Requirements

  • Provide support for games and other GPU-intensive software
  • Security: Must be secure enough to turn on by default
  • Usability: Must be an improvement over the current situation
  • Compatibility: Must work on consumer GPUs
  • No existing solution meets these requirements!

4 of 10

Proposal: Minimal userspace driver

  • Minimal: keep the amount of privileged code to a minimum
  • Userspace: kernel-independent iteration, no reboots on update, more robust to faults
  • Well-defined API: documented, transport-independent specification separate from the source code
  • Written in a modern, memory-safe language: avoids memory-corruption vulnerabilities and provides safe abstractions
  • Capability-based access control: used by modern systems such as seL4

5 of 10

Why minimal?

  • Modern GPUs have full support for virtual memory
  • GPU code is restricted by page tables, just like CPU code!
  • (Hopefully) no need for complex shader static analysis.
  • If shader static analysis is needed, it ought to be as simple as possible.
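
The isolation argument above can be illustrated with a toy model. This is only a sketch: real GPU page tables are multi-level structures walked by hardware, and the names `GpuPageTable`, `Mapping`, and `Fault` are hypothetical, chosen for illustration. The point it shows is that a tenant can only reach physical pages that the privileged driver has mapped into that tenant's own table.

```rust
use std::collections::HashMap;

/// Page size assumed for this toy model.
const PAGE_SIZE: u64 = 4096;

/// One virtual-page mapping (hypothetical type, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mapping {
    pub phys_page: u64,
    pub writable: bool,
}

/// Reasons a GPU memory access is refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Fault {
    NotMapped,
    WriteToReadOnly,
}

/// A single-level stand-in for one tenant's GPU page table.
#[derive(Default)]
pub struct GpuPageTable {
    pages: HashMap<u64, Mapping>,
}

impl GpuPageTable {
    /// Only the privileged driver would be allowed to call this.
    pub fn map(&mut self, virt_page: u64, mapping: Mapping) {
        self.pages.insert(virt_page, mapping);
    }

    /// Translate a virtual address, faulting on unmapped or read-only pages.
    pub fn translate(&self, virt_addr: u64, write: bool) -> Result<u64, Fault> {
        let mapping = self
            .pages
            .get(&(virt_addr / PAGE_SIZE))
            .ok_or(Fault::NotMapped)?;
        if write && !mapping.writable {
            return Err(Fault::WriteToReadOnly);
        }
        Ok(mapping.phys_page * PAGE_SIZE + virt_addr % PAGE_SIZE)
    }
}
```

In this model a shader running under one tenant's table has no way to name another tenant's memory, so the driver only needs to get the mappings right rather than analyze the shader code.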

6 of 10

Why userspace?

  • We can use modern programming languages, such as Rust.
  • We can use the full standard library.
  • We can develop faster. Restarting the driver does not require a reboot.
  • We can sandbox the driver (via an IOMMU) so it cannot (directly) compromise the rest of the system.
  • We can even embed the driver into a userspace program, such as a Wayland compositor.

7 of 10

Why a well-defined API?

  • Specification separate from the source code
  • Enables proper conformance testing
  • Less likely to have bugs
  • Transport-independent: clients can communicate using AF_UNIX, AF_VSOCK, vchans, seL4 IPC, or even a network!
  • Easier to fuzz
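
As a rough illustration of transport independence, the sketch below encodes and decodes requests against the standard `Read` and `Write` traits rather than any particular socket type. The `Request` variants, tag values, and wire layout are assumptions made up for this example; they are not part of any existing specification.

```rust
use std::io::{self, Read, Write};

/// Hypothetical request set, invented for this example.
#[derive(Debug, PartialEq, Eq)]
pub enum Request {
    AllocBuffer { size: u64 },
    FreeBuffer { handle: u32 },
}

// Assumed wire format: one tag byte followed by a little-endian payload.
const TAG_ALLOC: u8 = 1;
const TAG_FREE: u8 = 2;

/// Encode a request onto any byte-stream transport.
pub fn send<W: Write>(transport: &mut W, req: &Request) -> io::Result<()> {
    match req {
        Request::AllocBuffer { size } => {
            transport.write_all(&[TAG_ALLOC])?;
            transport.write_all(&size.to_le_bytes())
        }
        Request::FreeBuffer { handle } => {
            transport.write_all(&[TAG_FREE])?;
            transport.write_all(&handle.to_le_bytes())
        }
    }
}

/// Decode one request, rejecting unknown tags and truncated input.
pub fn recv<R: Read>(transport: &mut R) -> io::Result<Request> {
    let mut tag = [0u8; 1];
    transport.read_exact(&mut tag)?;
    match tag[0] {
        TAG_ALLOC => {
            let mut payload = [0u8; 8];
            transport.read_exact(&mut payload)?;
            Ok(Request::AllocBuffer { size: u64::from_le_bytes(payload) })
        }
        TAG_FREE => {
            let mut payload = [0u8; 4];
            transport.read_exact(&mut payload)?;
            Ok(Request::FreeBuffer { handle: u32::from_le_bytes(payload) })
        }
        other => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("unknown request tag {}", other),
        )),
    }
}
```

Because `recv` accepts anything that implements `Read`, the same decoder could sit behind a `std::os::unix::net::UnixStream` in production and be fed raw byte slices by a fuzzer or a conformance test.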

8 of 10

Why memory-safe?

Because memory unsafety is a huge source of security holes!

  • This problem is already very well known, so I will not go into detail here.
  • https://alexgaynor.net/2017/nov/20/a-vulnerability-by-any-other-name/

9 of 10

Why capabilities?

  • A capability both designates a resource and authorizes access to it
  • A resource can only be accessed if an appropriate capability is held.
  • Used by many other systems, such as seL4, Capsicum, and the WebAssembly System Interface (WASI)
  • Avoids all sorts of confused-deputy bugs
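
A minimal sketch of the capability model, assuming a design in which each client has its own handle table: a handle both names a buffer and carries the rights to it, and a handle the client does not hold resolves to nothing. `CapTable`, `Driver`, `Rights`, and `CapError` are hypothetical names introduced for illustration, not an existing API.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Rights {
    ReadOnly,
    ReadWrite,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CapError {
    NotHeld,
    ReadOnly,
}

/// One client's capability table: handle -> (buffer index, rights).
#[derive(Default)]
pub struct CapTable {
    slots: HashMap<u32, (usize, Rights)>,
    next_handle: u32,
}

impl CapTable {
    fn insert(&mut self, buffer: usize, rights: Rights) -> u32 {
        let handle = self.next_handle;
        self.next_handle += 1;
        self.slots.insert(handle, (buffer, rights));
        handle
    }

    /// Designation and authorization are checked in a single lookup.
    fn lookup(&self, handle: u32, need_write: bool) -> Result<usize, CapError> {
        match self.slots.get(&handle) {
            None => Err(CapError::NotHeld),
            Some((_, Rights::ReadOnly)) if need_write => Err(CapError::ReadOnly),
            Some((buffer, _)) => Ok(*buffer),
        }
    }
}

/// Toy driver that owns every buffer; clients only ever see handles.
#[derive(Default)]
pub struct Driver {
    buffers: Vec<Vec<u8>>,
}

impl Driver {
    /// Allocate a zeroed buffer and give the caller a read-write capability.
    pub fn alloc(&mut self, owner: &mut CapTable, len: usize) -> u32 {
        self.buffers.push(vec![0; len]);
        owner.insert(self.buffers.len() - 1, Rights::ReadWrite)
    }

    /// Derive a read-only capability for another client.
    pub fn share_read_only(
        &self,
        from: &CapTable,
        handle: u32,
        to: &mut CapTable,
    ) -> Result<u32, CapError> {
        let buffer = from.lookup(handle, false)?;
        Ok(to.insert(buffer, Rights::ReadOnly))
    }

    /// Copy as much of `data` as fits; requires a read-write capability.
    pub fn write(&mut self, client: &CapTable, handle: u32, data: &[u8]) -> Result<(), CapError> {
        let buffer = client.lookup(handle, true)?;
        let dst = &mut self.buffers[buffer];
        let n = data.len().min(dst.len());
        dst[..n].copy_from_slice(&data[..n]);
        Ok(())
    }

    pub fn read(&self, client: &CapTable, handle: u32) -> Result<&[u8], CapError> {
        let buffer = client.lookup(handle, false)?;
        Ok(&self.buffers[buffer])
    }
}
```

Since every operation goes through the caller's own table, the driver never acts on a global buffer ID supplied by a client, which is the pattern that tends to produce confused-deputy bugs.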

10 of 10

Let’s write this!

  • GPU multi-tenancy should be readily available
  • Giving VMs access to hardware-accelerated graphics should be safe.
  • One should not need a completely separate machine for gaming.
  • And it is up to us to make this happen.