1 of 24

2 of 24

Angr & Pwn

by malweisse

vulnerability detection using symbolic execution

3 of 24

About me

Andrea Fioraldi [ @andreafioraldi, @malweisse on IRC ]

22 years old, student of Engineering in Computer Science.

Interested in binary analysis (symbolic execution, reversing and other pretty stuffs) and binary exploitation (do u know what is double free?).

Capturing flags with TheRomanXpl0it and mHACKeroni.

Not so much skilled trumpeter, mountain bike and trekking lover, Dragon Ball fanboy, homebrewer.

4 of 24

Symbolic execution

Angr crash course

Angr and the bugs

5 of 24

Concrete execution

  • Single flow of instructions (path) at time
  • Can’t evaluate program behaviour

6 of 24

Symbolic execution

  • Explore all possible paths in a program
  • Evaluate how inputs affect the choice of a path
  • Use symbolic values as inputs
  • Logical expression in function of the symbolic inputs (path constraints and symbolic storage)
  • Execution forked on branches
  • SMT solver to evaluate that expressions

7 of 24

Symbolic execution

  • [Path #1] Formula: X + Y = 42,

Storage: c = 77

  • [Path #2] Formula: ¬(X + Y = 42),

Storage: c = 77

Target: line 12

8 of 24

Symbolic execution

  • [Path #1.1] Formula: X + Y = 42 ∧ c = 38, Storage: c = 77 − Y
  • [Path #1.2] Formula: X + Y = 42 ∧ c != 38, Storage: c = 77 − Y
  • [Path #2.1] Formula: ¬(X + Y = 42) ∧ c = 38, Storage: c = X − 77
  • [Path #2.2] Formula: ¬(X + Y = 42) ∧ c != 38, Storage: c = X − 77

The interesting paths are 1.1 and 2.1 because they reached our target

9 of 24

Symbolic execution

  • [Path #1.1] Formula: X + Y = 42 ∧ c = 38, Storage: c = 77 − Y

solve(X + Y = 42 ∧ 77 − Y = 38)

X = 3, Y = 39

10 of 24

Symbolic execution (what about?)

  • Symbolic pointers dereference?
  • Loops?
  • Exponential increase of the number of paths?
  • Environment interaction?
  • Non-linear constraints?

11 of 24

Symbolic execution

Angr crash course

Angr and the bugs

12 of 24

Angr WTF

Angr is a binary analysis framework.

Binary loader, emulator, symbolic executor.

http://angr.io/

python3 -m pip install angr

13 of 24

Angr modules

  • CLE, a multi-format binary loader with an intuitive API;

  • archinfo, a collection of classes that contain architecture-specific information;

  • PyVEX, a wrapper around Valgrind’s VEX IR lifter, used to make the analyses architecture-agnostic;

  • Claripy, the angr data backend, a wrapper around the Z3 solver and an interface to abstract concrete and symbolic values handling;

14 of 24

Loading a binary

import angr�# load the exampleproject = angr.Project("./foo")�

15 of 24

Simulation

# start a new SimulationManager�simgr = project.factory.simulation_manager()�# step�simgr.step()�# step until it branches�simgr.run(until=lambda sm: len(sm.active) != 1)�# check the states that are still activeprint (simgr.active)

16 of 24

States

state = simgr.active[0]��# a state has plugins, representing registers, memory, etcprint (state.regs.rax)�print (state.memory.load(state.regs.rsp, 8))�# one of the plugins represents the system stateprint (state.posix.fd)�# files are backed by a memory regionprint (state.posix.fd[0].read_data(8)) #return (value, size)��

17 of 24

Solver

state = simgr.active[0]��# SMT solverprint (state.solver)�addr = state.regs.rsp + 0x100

# each value in angr is represented as an expression tree�v = state.memory.load(addr, 8) + 0x10

print (v)

print (v.op) # __add__

print (v.args) # (other bitvector, 0x10)# add state constraints�state.add_constraints(v < 0x2a)�state.add_constraints(v > 0x1a)�# concretize a valueprint (hex(state.solver.eval(v)))

18 of 24

Simulation 2

# create a state starting on foo()�initial_state = project.factory.blank_state()�initial_state.regs.rip = 0x4005c7 # foo address

# create symbolic args�a = initial_state.solver.BVS("sym_a", 64)�b = initial_state.solver.BVS("sym_b", 64)�initial_state.regs.rdi = a�initial_state.regs.rsi = b�# start a new SimulationManager based on initial_state�simgr = project.factory.simulation_manager(initial_state)�# explore using conditions�simgr.explore(find=0x400600, avoid=0x40060e) # well done, try again print (simgr, simgr.found) # found stash# conretize found inputsprint (a.args[0], "=", simgr.found[0].solver.eval(a))�print (b.args[0], "=", simgr.found[0].solver.eval(b))

19 of 24

Default Stashes

  • active, states that are "live"
  • errored, states that errored out (actual angr bugs)
  • found, states that reached the "find" condition
  • avoided, states that hit the "avoid" condition
  • deadended, states that produced no successors
  • unconstrained, states that jump to unconstrained symbolic values

20 of 24

Symbolic execution

Angr crash course

Angr and the bugs

21 of 24

Stack Buffer Overflow

What is the purpose of a stack buffer overflow?

Control the program counter value overwriting the return address on the stack

22 of 24

Stack Buffer Overflow

DEMO

23 of 24

Thank you!

QUESTIONS?

24 of 24

References

USEFUL PAPERS:

  • R. Baldoni, E. Coppa, D. C. D’Elia, C. Demetrescu, and I. Finocchi, “A survey of symbolic execution techniques,” ACM Comput. Surv., vol. 51, no. 3, 2018
  • Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna, “SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis,” in IEEE Symposium on Security and Privacy, 2016.

BRIEF SUMMARY (the chapter 1 of my thesis):

MORE ABOUT ANGR