Operating Systems and C
Fall 2022, Security-Track�6. Stack-Based Exploits
· 1
01.10.2020
Parallel Tracks
perflab:
attacklab:
· 2
exciting part of the course!
security-track lecture nr. 1:�what you’ll need for attacklab.
performance-track lecture nr. 2:
what you’ll need for perflab.�optimizations. how to write code so compiler can derive performant code. manual transformations, blocking, loop unrolling
performance-track lecture nr. 1:
what you’ll need for perflab.�array layout, what it means for performance, cache hierarchy, how associativity is organized in the cache. (important stuff for anyone)
Outline
Brief review of assembly
Arrays, strings and structs in assembly
Structure of an .s program
Stack-based exploits and counter-measures
3
01.10.2020
basically �what you need �for the attacklab
(next week: not important for attacklab (culture-stuff) )
alignment; needed in assignment
you’ll need to disassemble a binary
X86-64 Assembly
· 4
09.09.2020
Memory
CPU
Programmer-Visible State
PC: Program counter
Register file
Condition codes
PC
Registers
Memory
Object Code
Program Data
OS Data
Addresses
Data
Instructions
Stack
Condition
Codes
instruction either
recall, what is a �stack (data structure)?�(push, pop)
von Neumann Architecture
Registers
The %rip register is the current instruction pointer. Contains address of next instruction to be executed.
There are 16 general purpose registers in x86-64. Additional registers for floating point, SIMD, …
16 registers: r0, r1, …, r15
· 5
09.09.2020
most instructions implicitly increment it. explicitly updated ⇒ change in control flow.
“register file”
aka. program counter (pc)
10s
General-Purpose Registers
For historical reasons, r0-r7 are called original registers. They have the following names:
· 6
09.09.2020
top & bottom�for a given frame
usually first parameters of functions �(if not arrays)
General-Purpose Registers
Register values can be accessed at different levels of granularity:
· 7
09.09.2020
why: e.g. �C short is 2B
memory is�byte-addressable
Instructions
Three classes of instructions:
· 8
09.09.2020
we didn’t talk much about these, except we walked about how these are built from logic gates
can loop
can if/then
AT&T syntax
The GNU tools (gcc, gdb) use AT&T Syntax for assembly.�example: movq %rsp, %rbp
Syntax is of the form
OPERATOR source, destination
Register names are prefixed with %
The alternative is the Intel syntax (on windows): MOVQ EBP, ESP – no %
Look for % in the assembly code, if they are present you are dealing with AT&T syntax
· 9
09.09.2020
never more than 2 operands.�when there are 2, this is the form.
10s
Procedure Call Example
· 10
09.09.2020
0x8048553
0x100
%rsp
%rip
%rsp
%rip
0x8048b90
0x108
0x10c
0x110
0x100
0x80854e
123
0x108
0x10c
0x110
123
0x108
call 8048b90
804854e: e8 3d 06 00 00 call 8048b90 <main>
8048553: 50 pushl %eax
return address
= address of next instruction
instruction pointer�updated to callee
top of stack
Procedure Call Example
· 11
09.09.2020
%rsp
0x100
%rsp
0x100
0x108
0x10c
0x110
0x8048553
123
0x108
0x10c
0x110
123
ret
8048591: c3 ret
0x108
0x8048553
%rip
%rip
0x8048553
0x8048591
(garbage)
just a pop
Q: what happens if user input overwrites stack pointer, or return address?�A: user has full control over control flow. (exploit)
Stack Frame
Caller:
Callee:
· 12
09.09.2020
Return Addr
Saved
Registers
+
Local
Variables
Argument
Build
Old %rbp
Arguments
Caller
Frame
Frame pointer�%rbp
Stack pointer
%rsp
stack frame
(1) : to restore caller state on return� (bottom frame has no caller).
(2) : pushes it when it calls� (top frame does not need this)
extremely important for assignment; you’ll spend hours seeing what stack looks like.
Outline
Brief review of assembly
Arrays, strings and structs in assembly
Structure of an .s program
Stack-based exploits and counter-measures
13
01.10.2020
Array Allocation
Basic Principle
T A[L];
A is an Array of data type T and length L
Contiguously allocated region of L * sizeof(T) bytes
· 14
01.10.2020
compiler is going to reserve space for the array.
examples follow
Array Allocation
· 15
01.10.2020
char string[12];
x
x + 12
int val[5];
x
x + 4
x + 8
x + 12
x + 16
x + 20
double a[3];
x + 24
x
x + 8
x + 16
char *p[3];
x
x + 8
x + 16
x + 24
8 bytes = 64 bits�(address size =� word size)
Array Access
int A[5] = {0, 1, 2, 3, 4};
Array of data type int and length 5
Identifier A can be used as a pointer to array element 0: Type int*
Reference Type Value
val[4] int 4
val int * x
val+1 int * x + 4
&val[2] int * x + 8
val[5] int ??
*(val+1) int 1
val + i int * x + 4 i
· 16
01.10.2020
val from �previous slide
undefined�behavior
Array Example
· 17
#define ZLEN 5
typedef int zip_dig[ZLEN];
zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };
zip_dig cmu;
1
5
2
1
3
16
20
24
28
32
36
zip_dig mit;
0
2
1
3
9
36
40
44
48
52
56
zip_dig ucb;
9
4
7
2
0
56
60
64
68
72
76
Declaration “zip_dig cmu” equivalent to “int cmu[5]”
Example arrays were allocated in successive 20 byte blocks
Not guaranteed to happen in general
(ucb = berkeley)
type alias
20 = 5*4
ARRAY LOOP EXAMPLE (IA32)
# rdx = z
movq $0, %rax # %rax = i
.L4: # loop:
addq $1, (%rdx,%rax,4) # z[i]++
addq $1, %rax # i++
cmpl $5, %rax # i:ZLEN (ZLEN==5)
jne .L4 # if !=, goto loop
void zincr(zip_dig z) {
int i;
for (i = 0; i < ZLEN; i++)
z[i]++;
}
Array Example
$1 is �constant 1
%r is �register r
addressing mechanism:�z + 4*i
Strings
Declared as read-only data.
.section .rodata
.Label:
.string ”String constant\n”
19
1 byte per character,�terminated by \0
we won’t spend more time on strings, because you won’t need them in the assignment.
Structures & Alignment
Unaligned Data
Aligned Data
Primitive data type requires K bytes ⇒
Its address must be multiple of K
c
i[0]
i[1]
v
3 bytes
4 bytes
p+0
p+4
p+8
p+16
p+24
Multiple of 4
Multiple of 8
Multiple of 8
Multiple of 8
c
i[0]
i[1]
v
p
p+1
p+5
p+9
p+17
struct S1 {
char c;
int i[2];
double v;
} *p;
starting address must be multiple of K (1, 2, 4, 8, 16, …)
not a power of 2.�moving data in memory,�and to/from register,�is a mess. instead:
gcc stack is 16-byte aligned.�remember that.
whole data structure is padded up to multiple of its largest K
Specific Cases of Alignment (x86-64)
no restrictions on address
lowest 1 bit of address must be 02
lowest 2 bits of address must be 002
lowest 3 bits of address must be 0002
Q: but Willard, that’s wasteful.�A: when we talk dynamic memory allocation (heap), we’ll see these bits used for something else (tags).
Meeting Overall Alignment Requirement
For largest alignment requirement K
Overall structure must be multiple of K
struct S2 {
double v;
int i[2];
char c;
} *p;
| v | i[0] | i[1] | c | 7 bytes | | ||||||||||||||||||
p+0 | | | | | | p+8 | | | | | | | p+16 | | | | | | | p+24 |
order in descending order of size makes it easier for compiler to store your struct more compactly.
Outline
Brief review (x86-64 + ex 3.67)
Arrays, strings and structs in assembly
Structure of an .s program
Stack-based exploits and counter-measures
23
01.10.2020
what transformation of my code does my compiler do?�that’s why we want to be able to read assembly.
Virtual Memory
· 24
01.10.2020
Kernel virtual memory
Memory mapped region for
shared libraries
Run-time heap
(created by malloc)
User stack
(created at runtime)
0
Memory
invisible to
user code
Read/write data
Read-only code and data
Loaded from the
hello executable file
printf function
Program
start
.bss:
.data:
.rodata:
.text:
(uninitialised)
(initialised � read-write)
(initialised read-only)
defined, but not �initialized
you have � different sections �in your assembly program, �which corresponds to � different regions �in virtual memory.
4 sections
(program)
Object file
· 25
01.10.2020
-h (headers) reveals structure of object file�(the sections, their size, etc.).
Assembly file
· 26
01.10.2020
a function
a function
read-only data
program
read-only data
program
Object file
· 27
01.10.2020
disassemble all sections, �not just those containing instructions
reveals the actual �address of each instruction
last 2 phases of�assignment: need�to use this to find�patterns in the code
Outline
Brief review (x86-64 + ex 3.67)
Arrays, strings and structs in assembly
Structure of an .s program
Stack-based exploits and counter-measures
28
01.10.2020
Vulnerable C code
· 29
01.10.2020
example of vulnerable C code
disable stack protection�(more on that later today)�(so it’s not really a problem today; compilers prevent problem by default)
process tried accessing something outside itself.�what’s happening? � let’s investigate.
Vulnerable C code
· 30
01.10.2020
The stack is 16B�aligned
=>
rsp is decreased �with 16B�sub $0x10, %rsp
push old base pointer �on stack
new base pointer := �old stack pointer
new stack pointer := �old stack pointer - 16
enough space for buf
tip: for assignment, always use objdump. with it, you get�assembly, and the object, and the address where it is
= %rsp
address of buff into rax
then into 1st arg
initialize return
evict stack frame
Vulnerable code
0x7fffffffe490
0x7fffffffe40A
Stack Frame
for main
Contents of gets
Before call to gets
0x7fffffffe40A
0x40059c
0x7fffffffe480
gets buffer�8B aligned
return�instruction
so, why vulnerable?
%rbp+1 is return instruction
if I write more than 16 bytes,�then I overwrite return address.�I take over the machine then�(or provoke segmentation fault)
Buffer Overflow Stack Example
0x7fffffffe490
0x7fffffffe40A
Stack Frame
for main
Contents of gets
Before call to gets
0x7fffffffe40A
0x40059c
0x7fffffffe480
depending on input,�different termination behavior.
Malicious Use of Buffer Overflow
Input string contains byte representation of executable code
Overwrite return address with address of exploit code
When bar() executes ret, will jump to exploit code
int bar() {
char buf[64];
gets(buf);
...
return ...;
}
void foo(){
bar();
...
}
Stack after call to gets()
B
return
address
A
foo stack frame
bar stack frame
B
exploit
code
pad
data written
by gets()
payload:� exploit code (backwards) ++ pad ++ B
Exploits Based on Buffer Overflows
Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines
Internet worm
Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:
Worm attacked fingerd server by sending phony argument:
Morris worm, 1988
how to stop this:
Avoiding Overflow Vulnerability
Use library routines that limit string lengths
fgets instead of gets
strncpy instead of strcpy
Don’t use scanf with %s conversion specification
/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� fgets(buf, 4, stdin);� puts(buf);�}
use a function that checks how much you read.
System-Level Protections
Randomized stack offsets
At start of program, allocate random amount of space on stack
Makes it difficult for hacker to predict beginning of inserted code
Nonexecutable code segments
In traditional x86, can mark region of memory as either “read-only” or “writeable”
X86-64 added explicit “execute” permission
Stack Canaries
Idea
Place special value (“canary”) on stack just beyond buffer
Check for corruption before exiting function
GCC Implementation
-fstack-protector
-fstack-protector-all (default)
Setting Up Canary
echo:
. . .
mov %fs:20, %rax # Get canary
mov %rax, -8(%rbp) # Put on stack
xor %eax, %eax # Erase canary
. . .
/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� gets(buf);� puts(buf);�}
Return Address
%rbp
Stack Frame
for main
Content of gets
Before call to gets
Canary
Saved %rbp
random part of memory into rax
put canary just below base pointer
must overwrite canary to overwrite return address
Checking Canary
echo:
. . .
mov -8(%rbp), %rax # Retrieve from stack
xor %fs:20, %rax # Compare with Canary
je .L24 # Same: skip ahead
call __stack_chk_fail # ERROR
.L24:
. . .
/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� gets(buf);� puts(buf);�}
Return Address
%rbp
Stack Frame
for main
Content of gets
Before call to gets
Canary
Saved %rbp
%rbp-0x8
%rbp-0x10
Canary �Example
Canary �Example
when compiled w/ protection
GCC protections
Compile-time
Run-time
pretty clear that you shouldn’t use gets.
Willard Rafnsson�IT University of Copenhagen
wilr@itu.dk�https://www.willardthor.com/
goal: tools that developers can use to write secure SW.
sample research (past supervisions):
I like code, and I like proofs.
I created the “Applied Information Security” course.
I’m a barista in Analog.
formal�methods
programming�languages
computer�security
my�research
shameless self-promotion
Take-Aways
Alignment is necessary when working with structs
Object files are sequences of hex codes, that can be mapped to assembly instructions
Buffer overflows cause vulnerabilities that can be exploited maliciously
Counter measures include (i) randomized stack addresses, (ii) non executable code on stack and (iii) canaries.
· 44
01.10.2020