1 of 47

Syzkaller: coverage-guided fuzzer for the Linux kernel

Andrey Konovalov <andreyknvl@google.com>

DC4822 Meetup

June 23rd 2018

2 of 47

Whoami

  • Andrey Konovalov <andreyknvl@google.com>
  • Software engineer at Google based in Munich
  • Work on the Linux kernel bug finding tools
  • DC7499

  • https://twitter.com/andreyknvl
  • https://github.com/xairy
  • https://xairy.github.io/

3 of 47

Dynamic tools: userspace

  • AddressSanitizer, ThreadSanitizer, MemorySanitizer, ...
    • C/C++/Go bug detectors
    • Detect wide range of bug types (use-after-frees and out-of-bounds, data races and deadlocks, uninitialized memory uses, undefined behaviors)
    • Built into Clang, GCC and Go compilers

  • libFuzzer
    • In-process coverage-guided fuzzing engine for C/C++

  • oss-fuzz
    • Continuous fuzzing service for open source software

4 of 47

Agenda

  • Introduction to dynamic analysis and fuzzing
  • Linux kernel overview
  • Kernel bug detectors: Sanitizers
  • Kernel fuzzing: syzkaller and syzbot

5 of 47

Introduction

6 of 47

Static and dynamic analysis

  • Static analysis "is the analysis of computer software that is performed without actually executing programs". Essentially reading the source code (or the disassembly) manually or processing it using some tools.

  • Dynamic analysis "is the testing and evaluation of an application during runtime". Essentially running the program and checking whether it crashes or inspecting it during runtime.

7 of 47

Dynamic analysis: why?

  • Context:
    • Huge codebases
    • Limited human resources

  • Advantages:
    • No false positives
    • Undeniable proofs
    • Works on any code (pointers, unions, arrays, indirect calls, etc.)

  • Downsides:
    • See next slide...

8 of 47

Dynamic analysis

To successfully perform dynamic analysis we need:

  1. A way to detect bugs when they happen
  2. Some bugs don't manifest in any obvious way (e.g. OOB read)
  3. Examples of bug detectors: valgrind, ASan

  • A way to execute different parts of the program
  • The more code we cover, the more the chance to hit the code that contains a bug
  • Example approaches: run unit / regression / … tests (if they exist), fuzzing

  • (A way to run and debug the application)

9 of 47

Fuzzing

  • Idea: let's feed randomly generated inputs to our program

  • Dumb approach: randomly generate binary blobs

  • Smarter approaches:
    • Grammar-based fuzzing: generate structured inputs (assuming we know the structure of the inputs) e.g. based on grammar/templates
    • Coveraged-guided fuzzing: trace the execution of the program on each of the tested inputs and save the ones that trigger new paths within the program to corpus
    • Mutation-based fuzzing: generate new inputs by modifying (mutating) existing ones (assuming we have a corpus of inputs)

10 of 47

Grammar-based fuzzing

Fuzzer uses explicit input grammar to generate complex realistic inputs:

HTTP-message = Simple-Request | Simple-Response | Full-Request | Full-Response

Simple-Request = "GET" SP Request-URI CRLF

Full-Request = Request-Line *( General-Header | Request-Header | Entity-Header ) CRLF

[ Entity-Body ]

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "PATCH" | "COPY" | "MOVE" |

"DELETE" | "LINK" | "UNLINK" | "TRACE" | "WRAPPED" | extension-method

Request-URI = "*" | absoluteURI | abs_path

absoluteURI = scheme ":" *( uchar | reserved )

HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT

http_URL = "http:" "//" host [ ":" port ] [ abs_path ]

...

11 of 47

Coverage-guided fuzzing

start with some initial corpus of inputs (potentially empty)

for (;;) {

choose an input form the corpus and mutate it

execute and collect code coverage

if (input gives new coverage)

add it to corpus

}

Advantages:

  • Turns exponential problem into linear (more or less)
  • Inputs are reproducers
  • Corpus is suitable for regression testing

12 of 47

Target:

Linux kernel

13 of 47

Linux kernel

  • 20 million lines of code
  • Written in an unsafe language: C
  • Almost no unit/regression tests
    • "Regression testing"? What's that? If it compiles, it is good; if it boots up, it is perfect. - Linus Torvalds
  • A lot of undocumented interfaces

14 of 47

Kernel inputs

Kernel

Userspace

syscalls

network

...

USB

15 of 47

Linux kernel fuzzing

  • Goal: grammar-based coverage-guided fuzzing
    • Need a way to collect coverage
    • Need descriptions of all syscalls

  • Need bug detectors

16 of 47

Bug detectors:

Sanitizers

17 of 47

Kernel Sanitizers

  • KASAN (use-after-frees and out-of-bounds)
    • CONFIG_KASAN available upstream since 4.0
  • KTSAN (data-races and deadlocks)
    • prototype available at https://github.com/google/ktsan
  • KMSAN (uninitialized-memory-use)
    • prototype available at https://github.com/google/kmsan
  • KHWASAN (similar to KASAN, requires less memory)

18 of 47

KASAN report

BUG: KASAN: use-after-free in sctp_id2assoc+0x3a3/0x3b0 net/sctp/socket.c:224

Read of size 8 at addr ffff8800385b6838 by task syz-executor2/5889

Call Trace:

sctp_id2assoc+0x3a3/0x3b0 net/sctp/socket.c:224

sctp_setsockopt_rtoinfo net/sctp/socket.c:2931 [inline]

sctp_setsockopt+0x4b7/0x60f0 net/sctp/socket.c:4018

sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2850

SYSC_setsockopt net/socket.c:1798 [inline]

SyS_setsockopt+0x270/0x3a0 net/socket.c:1777

entry_SYSCALL_64_fastpath+0x1f/0xbe

19 of 47

KASAN report

Allocated by task 5841:

kzalloc include/linux/slab.h:663 [inline]

sctp_association_new+0x114/0x2180 net/sctp/associola.c:309

...

SyS_connect+0x24/0x30 net/socket.c:1569

entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 5882:

kfree+0xe8/0x2b0 mm/slub.c:3882

sctp_association_destroy net/sctp/associola.c:435 [inline]

...

syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263

entry_SYSCALL_64_fastpath+0xbc/0xbe

20 of 47

Fuzzer:

syzkaller

21 of 47

Existing system call fuzzers

Trinity (and others) in essence:

while (true) syscall(rand(), rand(), rand());

Knows argument types, so more like:

while (true) syscall(rand(), rand_fd(), rand_addr());

  • Tend to find shallow bugs
  • Frequently no reproducers

22 of 47

syzkaller

  • Coverage-guided
  • Grammar-based
  • Unsupervised
  • Multi-
    • OS (Linux, *BSD, Fuchsia, …)
    • arch (x86-64, arm64, …)
    • machine (QEMU, Android phones, ...)

  • Can generate C reproducers for found bugs

23 of 47

syzkaller

24 of 47

Coverage for the Linux kernel

  • Available upstream with CONFIG_KCOV
  • GCC pass that inserts a function call into every basic block
  • kernel debugfs extension that collects and exposes coverage per-thread

__fuzz_coverage(); // 1

if (...) {

__fuzz_coverage(); // 2

...

}

__fuzz_coverage(); // 3

if (...) {

...

}

25 of 47

Syscall descriptions

Declarative description of all syscalls:

open(file filename, flags flags[open_flags],

mode flags[open_mode]) fd

read(fd fd, buf buffer[out], count len[buf])

close(fd fd)

open_flags = O_RDONLY, O_WRONLY, O_RDWR, O_APPEND ...

26 of 47

Programs

The description allows to generate and mutate "programs" in the following form:

mmap(&(0x7f0000000000), (0x1000), 0x3, 0x32, -1, 0)

r0 = open(&(0x7f0000000000)="./file0", 0x3, 0x9)

read(r0, &(0x7f0000000000), 42)

close(r0)

27 of 47

Algorithm

  1. Start with an empty corpus of programs

  • Generate a new program, or choose an existing program from corpus and mutate it

  • Run the program, collect coverage

  • If new code is covered, minimize the program and add to the corpus

  • Goto 1

28 of 47

External kernel inputs

Kernel

Userspace

syscalls

network

...

USB

TUN/TAP

Gadget API

29 of 47

Test machines

Currently supports:

  • QEMU
  • GCE (Google Compute Engine)
  • Android devices (with serial cable/Suzy-Q)
  • ODROID C2 boards
  • Standalone machines (no reboot)

The list is extensible, to support a new type need to:

  • Reboot or recreate the machine (to repair after a crash)
  • Copy and run binaries (think of scp/ssh)
  • Get console output (to grep for crashes)

30 of 47

Demo

31 of 47

syzkaller:

automation

32 of 47

Ideally...

FUZZER

Resources

PoCs

33 of 47

Reality

Operation of a typical kernel fuzzer:

  • Manually create a bunch of VMs
  • Manually copy and start the binary
  • Manually monitor console output
  • Manually deduplicate crashes
  • Manually localize and reproduce
  • Manually restart the crashed VM (or press power button!)

Works only if you want to find a handful of crashes. Otherwise - full time job.

34 of 47

syzkaller operation

syz-manager

Resources

PoCs

test machines

corpus/programs

console output

  • manages test machines
  • monitors console output
  • deduplicates crashes
  • localizes bugs
  • creates PoCs

*but may incur higher up-front setup cost

35 of 47

Further automation

  • syz-ci
    • Continuous kernel/syzkaller updates
    • Restarts syz-manager on new kernel/syzkaller

  • syz-hub
    • Input/reproducer exchange between several syz-manager's

  • syzbot
    • Bug aggregation
    • Reporting
    • Status tracking

36 of 47

Demo

37 of 47

Exploitation

  • The exploits that I wrote:

  • Some materials regarding Linux kernel exploitation:

38 of 47

Contribution

  • Contribution to syzkaller
    • Syscall descriptions
    • Fixing bugs reported by syzbot

  • Other fun stuff to do
    • Find some exploitable bug and write an exploit

39 of 47

Questions?

Andrey Konovalov <andreyknvl@google.com>

syzkaller@googlegroups.com

40 of 47

Backup

41 of 47

Other Linux kernel fuzzers

42 of 47

Process structure

43 of 47

Syscall descriptions:

discriminated syscalls

socket$inet_tcp(

domain const[AF_INET],

type const[SOCK_STREAM],

proto const[0]) sock_tcp

socket$packet(

domain const[AF_PACKET],

type flags[packet_socket_type],

proto const[ETH_P_ALL_BE]) sock_packet

44 of 47

Syscall descriptions:

struct layout

sockaddr_ll {

sll_family const[AF_PACKET, int16]

sll_protocol flags[packet_protocols, int16be]

sll_ifindex ifindex

sll_hatype const[ARPHRD_ETHER, int16]

sll_pkttype int8

sll_halen const[6, int8]

sll_addr mac_addr

pad array[const[0, int8], 2]

}

bind$packet(fd sock_packet, addr ptr[in, sockaddr_ll], addrlen len[addr])

45 of 47

Syscall descriptions:

resources

resource sock[fd]

socket(domain flags[socket_domain], type flags[socket_type],

proto int8) sock

bind(fd sock, addr ptr[in, sockaddr_storage], addrlen len[addr])

resource sock_packet[sock]

socket$packet(domain const[AF_PACKET], type flags[packet_socket_type], proto const[ETH_P_ALL_BE]) sock_packet

bind$packet(fd sock_packet, addr ptr[in, sockaddr_ll], addrlen len[addr])

46 of 47

External Stimulus

Systems calls and external stimulus in the same program:

listen(r0)

emit_ethernet(syn)

emit_ethernet(ack)

r1 = accept(r0)

emit_ethernet(data)

read(r1)

emit_ethernet(rst)

Work in progress; also applicable to USB, ...

47 of 47

Finding 0days in the Linux kernel

  • Setup syzkaller

  • Write syscall descriptions for some previously untested kernel interface

  • Specify syscalls required to interact with this interface in the configuration file

  • Run syzkaller and discover bugs