1 of 17

Incoherent Data and Instruction Caches are the Original DEP

CSAW 2014 Workshop Series

Dino A. Dai Zovi

InfoSec Technical Lead @ Square

ddz@squareup.com / ddz@theta44.org

@dinodaizovi

2 of 17

C.R.E.A.M.

Caches Ruin Exploits Around Me

3 of 17

SIGILL?!?

Program received signal SIGILL, Illegal instruction.

0xbeffa890 in ?? ()

(gdb) x/i 0xbeffa890

0xbeffa890: add r6, pc, #1

Looks fine to me (and GDB)...

4 of 17

CPU Instruction Cycle

Where can exceptions occur and result in:

  • SIGBUS?
  • SIGSEGV?
  • SIGILL?

"Comp fetch execute cycle" by Ratbum - Own work. Licensed under Creative Commons Attribution 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Comp_fetch_execute_cycle.png

5 of 17

Example: BeagleBone Black

6 of 17

AM335x Technical Reference Manual

Caches, how do they work?

What does all of that actually mean?!?

7 of 17

Memory Hierarchy

http://www.edwardbosworth.com/CPSC2105/Lectures/Slides_06/Chapter_07/Pentium_Architecture_files/image004.gif

8 of 17

Separate vs. Unified/Integrated

Separate caches:

  • Instruction cache only used for i-fetch
  • Data cache used for data loads and stores

Unified (“integrated”) caches:

  • Cache lines can store either instructions or data

9 of 17

Data/Instruction Cache Incoherency

  • If you are used to exploiting x86, you have probably never run into this
    • Intel caches use memory bus snooping to keep separate L1 data and instruction caches coherent
  • RISC-based processors typically rely on software to keep them coherent
    • Self-modifying code requires special considerations

10 of 17

16-Word Lines

  • How many bytes in a word on this CPU?
  • How many bytes in each cache line?
  • How many lines in each 32KB L1 cache?
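The three questions above reduce to a little arithmetic. The sketch below works it through, assuming a 32-bit word (4 bytes), which is an assumption about the target rather than something stated on the slide. The `line_base` helper is a hypothetical name added for illustration; it shows which cache line the faulting address from the SIGILL slide would fall in under those assumptions.

```python
# Cache geometry arithmetic for the "16-Word Lines" slide.
# Assumption: a word is 32 bits (4 bytes), as on a 32-bit ARM core.
WORD_BYTES = 4
WORDS_PER_LINE = 16          # from the slide title
L1_CACHE_BYTES = 32 * 1024   # "32KB L1 cache" from the slide

line_bytes = WORDS_PER_LINE * WORD_BYTES
num_lines = L1_CACHE_BYTES // line_bytes


def line_base(addr):
    """Return the address of the first byte of the cache line holding addr."""
    return addr & ~(line_bytes - 1)


print(line_bytes)                   # 64
print(num_lines)                    # 512
print(hex(line_base(0xbeffa890)))   # 0xbeffa880
```

Under these assumptions a payload does not need to be large to straddle lines: anything longer than 64 bytes, or shorter but poorly aligned, touches more than one.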

11 of 17

SIGILL?!?

Program received signal SIGILL, Illegal instruction.

0xbeffa890 in ?? ()

(gdb) x/i 0xbeffa890

0xbeffa890: add r6, pc, #1

Looks fine to me (and GDB)...

12 of 17

Code Injection and Separate Caches

  1. CPU writes shellcode to 0xbeffa890
    1. Writes data to L1 data cache
    2. L1 is a write-back cache: the line is written to L2 only when it is evicted from L1
  2. CPU jumps to 0xbeffa890
    • Fetch instruction at 0xbeffa890 from memory
    • Not found in L1 instruction cache, so load it from L2 cache or main memory instead
  3. Memory fetch loads old data, not payload :(
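The sequence above can be reproduced with a toy model. This is a minimal sketch, not a model of any real core: the class `SplitCacheCPU` and its methods are hypothetical names, the caches are plain dictionaries keyed by address, and L2 plus main memory are collapsed into one `memory` dictionary. It only illustrates why a data read (such as a debugger inspecting memory) and an instruction fetch can disagree about the same address.

```python
class SplitCacheCPU:
    """Toy model: write-back L1 D-cache, separate L1 I-cache, no coherency."""

    def __init__(self, memory):
        self.memory = dict(memory)   # stands in for L2 cache + main memory
        self.dcache = {}             # addr -> dirty value not yet written back
        self.icache = {}             # addr -> value fetched for execution

    def store(self, addr, value):
        # Data writes land in the D-cache only (write-back policy).
        self.dcache[addr] = value

    def load(self, addr):
        # Data reads check the D-cache first, then fall back to memory.
        return self.dcache.get(addr, self.memory.get(addr))

    def fetch(self, addr):
        # Instruction fetch never consults the D-cache in this model.
        if addr not in self.icache:
            self.icache[addr] = self.memory.get(addr)
        return self.icache[addr]

    def clean_dcache(self):
        # Write every dirty entry back to memory.
        self.memory.update(self.dcache)
        self.dcache.clear()

    def invalidate_icache(self):
        # Drop cached instructions so the next fetch goes to memory.
        self.icache.clear()


ADDR = 0xbeffa890
cpu = SplitCacheCPU({ADDR: "old stack contents"})

cpu.store(ADDR, "add r6, pc, #1")      # step 1: write shellcode
seen_by_data_read = cpu.load(ADDR)     # what a data-side read observes
seen_by_fetch = cpu.fetch(ADDR)        # step 2: jump to it

cpu.clean_dcache()                     # software-managed coherency
cpu.invalidate_icache()
seen_after_sync = cpu.fetch(ADDR)

print(seen_by_data_read)   # add r6, pc, #1
print(seen_by_fetch)       # old stack contents
print(seen_after_sync)     # add r6, pc, #1
```

The data-side view shows the payload while the fetch side still sees the old bytes, which matches the "looks fine to me (and GDB)" symptom from the SIGILL slide.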

13 of 17

Code Injection and Separate Caches

http://www.edwardbosworth.com/CPSC2105/Lectures/Slides_06/Chapter_07/Pentium_Architecture_files/image004.gif

Payload

16 of 17

Flushing the Data Cache

  • ARM cache flush instructions are privileged
    • System calls are usually provided to do it
  • Perform any system call (flushes the cache as a side effect on some archs/OSes)
  • How else?
    • Read or write at least an L1-cache-sized amount of new data
    • Should evict cache lines containing shellcode
  • Or, sidestep the issue and use ROP
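The "process an L1-sized amount of new data" idea can be illustrated with another toy model. This sketch assumes a direct-mapped, write-back data cache with 64-byte lines and 512 lines (32 KB). The class name `DirectMappedDCache`, the buffer address `BUF`, and the direct-mapped organization are all assumptions made for illustration. A real L1 is typically set-associative and may use a replacement policy that makes eviction less predictable, so in practice more data may need to be touched.

```python
LINE_BYTES = 64
NUM_LINES = 512   # 32 KB / 64 B, assuming 4-byte words and 16-word lines


class DirectMappedDCache:
    """Toy write-back D-cache: one slot per index, dirty lines written on eviction."""

    def __init__(self, memory):
        self.memory = memory   # line base address -> contents (L2 / RAM)
        self.lines = {}        # index -> (line base address, data, dirty flag)

    def access(self, addr, data=None):
        """Read addr, or write it when data is given; returns the line contents."""
        base = addr - addr % LINE_BYTES
        idx = (addr // LINE_BYTES) % NUM_LINES
        held = self.lines.get(idx)
        if held is not None and held[0] != base:
            old_base, old_data, dirty = held
            if dirty:
                self.memory[old_base] = old_data   # write-back on eviction
            held = None
        if held is None:
            held = (base, self.memory.get(base), False)   # line fill
        if data is not None:
            held = (base, data, True)                     # mark dirty
        self.lines[idx] = held
        return held[1]


memory = {}
cache = DirectMappedDCache(memory)

SHELLCODE_LINE = 0xbeffa880            # line holding 0xbeffa890
cache.access(SHELLCODE_LINE, data="payload")
in_memory_before = SHELLCODE_LINE in memory

BUF = 0x10000                          # hypothetical unrelated buffer
for offset in range(0, LINE_BYTES * NUM_LINES, LINE_BYTES):
    cache.access(BUF + offset)         # touch one line per cache index

in_memory_after = memory.get(SHELLCODE_LINE)

print(in_memory_before)   # False
print(in_memory_after)    # payload
```

This only gets the payload out of the data cache and into the next level. If the instruction cache already holds a stale copy of that line, the fetch can still return old bytes, which is one reason the ROP option on the slide is attractive.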

17 of 17

Summary

Where will you run into this?

  • Embedded devices w/ RISC-based CPUs

Reliable exploitation requires deep understanding of how your target works

  • Computer architecture
  • OS, runtime (kernel, linker, heap allocators)