1 of 44

Operating Systems and C

Fall 2022, Security-Track6. Stack-Based Exploits

· 1

01.10.2020

2 of 44

Parallel Tracks

perflab:

  • have two matrix multiplication procedures �(rotate, smooth), have to rewrite it.
  • about optimization techniques; �blocking, loop unrolling, etc.

attacklab:

  • you have an executable, have to attack it.
  • about the stack; code injection (smashing the stack), return-oriented programming (find interesting code in other programs).

· 2

exciting part of the course!

security-track lecture nr. 1:�what you’ll need for attacklab.

performance-track lecture nr. 2:

what you’ll need for perflab.optimizations. how to write code so compiler can derive performant code. manual transformations, blocking, loop unrolling

performance-track lecture nr. 1:

what you’ll need for perflab.�array layout, what it means for performance, cache hierarchy, how associativity is organized in the cache. (important stuff for anyone)

3 of 44

Outline

Brief review of assembly

Arrays, strings and structs in assembly

Structure of an .s program

Stack-based exploits and counter-measures

3

01.10.2020

basically �what you need �for the attacklab

(next week: not important for attacklab (culture-stuff) )

alignment; needed in assignment

you’ll need to disassemble a binary

4 of 44

X86-64 Assembly

· 4

09.09.2020

Memory

      • Byte addressable array
      • Code, user data, (some) OS data
      • Includes stack used to support procedures

CPU

Programmer-Visible State

PC: Program counter

      • Address of next instruction, 8B
      • Called “RIP” (x86-64)

Register file

      • Heavily used program data
      • Each register contains 8B

Condition codes

      • Store status information about �most recent arithmetic operation
      • Used for conditional branching

PC

Registers

Memory

Object Code

Program Data

OS Data

Addresses

Data

Instructions

Stack

Condition

Codes

instruction either

  • op on registers (state), or
  • transfers data to/from mem

recall, what is a �stack (data structure)?�(push, pop)

von Neumann Architecture

5 of 44

Registers

The %rip register is the current instruction pointer. Contains address of next instruction to be executed.

There are 16 general purpose registers in x86-64. Additional registers for floating point, SIMD, …

16 registers: r0, r1, …, r15

· 5

09.09.2020

most instructions implicitly increment it. explicitly updated ⇒ change in control flow.

“register file”

aka. program counter (pc)

10s

6 of 44

General-Purpose Registers

For historical reasons, r0-r7 are called original registers. They have the following names:

  • ax: register a
  • bx: register b
  • cx: register c
  • dx: register d

  • bp: register base pointer (start of stack)
  • sp: register stack pointer (current location in stack, grow downwards)

  • si: register source index (source for data copies)
  • di: register destination index (destination for data copies)

· 6

09.09.2020

top & bottom�for a given frame

usually first parameters of functions �(if not arrays)

7 of 44

General-Purpose Registers

Register values can be accessed at different levels of granularity:

  • 8B:
    • original registers: prefix r rax, rsp, rsi
    • other registers: no suffix r8, r15
  • 4B:
    • original registers: prefix e eax, esp, esi
    • other registers: suffix d r8d, r15d
  • 2B:
    • original registers: no prefix ax, sp, si
    • other registers: suffix w r8w, r15w
  • 1B (high byte):
    • original registers (bits 8-15 from ax-dx) ah, bh, ch, dh
  • 1B (low byte):
    • original registers (bits 0-7 from ax-dx) al, bl, cl, dl
    • other registers: suffix b r8b, r15b

· 7

09.09.2020

why: e.g. �C short is 2B

memory is�byte-addressable

8 of 44

Instructions

Three classes of instructions:

  1. Transfer between memory and register
    • Load/store data: register <-> memory (e.g. mov)
    • Push/pop: register <-> stack

  • Arithmetic and comparison functions

  • Transfer control
    • Jumps to/from procedures
    • Conditional branches

· 8

09.09.2020

we didn’t talk much about these, except we walked about how these are built from logic gates

can loop

can if/then

9 of 44

AT&T syntax

The GNU tools (gcc, gdb) use AT&T Syntax for assembly.�example: movq %rsp, %rbp

Syntax is of the form

OPERATOR source, destination

Register names are prefixed with %

The alternative is the Intel syntax (on windows): MOVQ EBP, ESP – no %

Look for % in the assembly code, if they are present you are dealing with AT&T syntax

· 9

09.09.2020

never more than 2 operands.�when there are 2, this is the form.

10s

10 of 44

Procedure Call Example

· 10

09.09.2020

0x8048553

0x100

%rsp

%rip

%rsp

%rip

0x8048b90

0x108

0x10c

0x110

0x100

0x80854e

123

0x108

0x10c

0x110

123

0x108

call 8048b90

804854e: e8 3d 06 00 00 call 8048b90 <main>

8048553: 50 pushl %eax

return address

= address of next instruction

instruction pointer�updated to callee

top of stack

11 of 44

Procedure Call Example

· 11

09.09.2020

%rsp

0x100

%rsp

0x100

0x108

0x10c

0x110

0x8048553

123

0x108

0x10c

0x110

123

ret

8048591: c3 ret

0x108

0x8048553

%rip

%rip

0x8048553

0x8048591

(garbage)

just a pop

Q: what happens if user input overwrites stack pointer, or return address?�A: user has full control over control flow. (exploit)

12 of 44

Stack Frame

Caller:

  • Arguments
    • pushed by program (if needed)
  • Return address
    • pushed by call

Callee:

  • Previous frame pointer (%rbp)
  • Other callee-save registers (%rbx, %r12-15)
  • Space for local variables
  • Arguments for next function �(when about to call another function)

· 12

09.09.2020

Return Addr

Saved

Registers

+

Local

Variables

Argument

Build

Old %rbp

Arguments

Caller

Frame

Frame pointer�%rbp

Stack pointer

%rsp

stack frame

  • caller’s saved registers(1)
  • local vars
  • args
  • return address(2)

(1) : to restore caller state on return� (bottom frame has no caller).

(2) : pushes it when it calls� (top frame does not need this)

extremely important for assignment; you’ll spend hours seeing what stack looks like.

13 of 44

Outline

Brief review of assembly

Arrays, strings and structs in assembly

Structure of an .s program

Stack-based exploits and counter-measures

13

01.10.2020

14 of 44

Array Allocation

Basic Principle

T A[L];

A is an Array of data type T and length L

Contiguously allocated region of L * sizeof(T) bytes

· 14

01.10.2020

compiler is going to reserve space for the array.

examples follow

15 of 44

Array Allocation

· 15

01.10.2020

char string[12];

x

x + 12

int val[5];

x

x + 4

x + 8

x + 12

x + 16

x + 20

double a[3];

x + 24

x

x + 8

x + 16

char *p[3];

x

x + 8

x + 16

x + 24

8 bytes = 64 bits�(address size =� word size)

16 of 44

Array Access

int A[5] = {0, 1, 2, 3, 4};

Array of data type int and length 5

Identifier A can be used as a pointer to array element 0: Type int*

Reference Type Value

val[4] int 4

val int * x

val+1 int * x + 4

&val[2] int * x + 8

val[5] int ??

*(val+1) int 1

val + i int * x + 4 i

· 16

01.10.2020

val from �previous slide

undefined�behavior

17 of 44

Array Example

· 17

#define ZLEN 5

typedef int zip_dig[ZLEN];

zip_dig cmu = { 1, 5, 2, 1, 3 };

zip_dig mit = { 0, 2, 1, 3, 9 };

zip_dig ucb = { 9, 4, 7, 2, 0 };

zip_dig cmu;

1

5

2

1

3

16

20

24

28

32

36

zip_dig mit;

0

2

1

3

9

36

40

44

48

52

56

zip_dig ucb;

9

4

7

2

0

56

60

64

68

72

76

Declaration “zip_dig cmu” equivalent to “int cmu[5]

Example arrays were allocated in successive 20 byte blocks

Not guaranteed to happen in general

(ucb = berkeley)

type alias

20 = 5*4

18 of 44

ARRAY LOOP EXAMPLE (IA32)

# rdx = z

movq $0, %rax # %rax = i

.L4: # loop:

addq $1, (%rdx,%rax,4) # z[i]++

addq $1, %rax # i++

cmpl $5, %rax # i:ZLEN (ZLEN==5)

jne .L4 # if !=, goto loop

void zincr(zip_dig z) {

int i;

for (i = 0; i < ZLEN; i++)

z[i]++;

}

Array Example

$1 is �constant 1

%r is �register r

addressing mechanism:�z + 4*i

19 of 44

Strings

Declared as read-only data.

.section .rodata

.Label:

.string ”String constant\n”

19

1 byte per character,�terminated by \0

we won’t spend more time on strings, because you won’t need them in the assignment.

20 of 44

Structures & Alignment

Unaligned Data

Aligned Data

Primitive data type requires K bytes ⇒

Its address must be multiple of K

c

i[0]

i[1]

v

3 bytes

4 bytes

p+0

p+4

p+8

p+16

p+24

Multiple of 4

Multiple of 8

Multiple of 8

Multiple of 8

c

i[0]

i[1]

v

p

p+1

p+5

p+9

p+17

struct S1 {

char c;

int i[2];

double v;

} *p;

starting address must be multiple of K (1, 2, 4, 8, 16, )

not a power of 2.�moving data in memory,�and to/from register,�is a mess. instead:

gcc stack is 16-byte aligned.�remember that.

whole data structure is padded up to multiple of its largest K

21 of 44

Specific Cases of Alignment (x86-64)

  • 1 byte: char, …

no restrictions on address

  • 2 bytes: short, …

lowest 1 bit of address must be 02

  • 4 bytes: int, float, …

lowest 2 bits of address must be 002

  • 8 bytes: double, char *, …

lowest 3 bits of address must be 0002

Q: but Willard, that’s wasteful.�A: when we talk dynamic memory allocation (heap), we’ll see these bits used for something else (tags).

22 of 44

Meeting Overall Alignment Requirement

For largest alignment requirement K

Overall structure must be multiple of K

struct S2 {

double v;

int i[2];

char c;

} *p;

v

i[0]

i[1]

c

7 bytes

p+0

p+8

p+16

p+24

order in descending order of size makes it easier for compiler to store your struct more compactly.

23 of 44

Outline

Brief review (x86-64 + ex 3.67)

Arrays, strings and structs in assembly

Structure of an .s program

Stack-based exploits and counter-measures

23

01.10.2020

what transformation of my code does my compiler do?�that’s why we want to be able to read assembly.

24 of 44

Virtual Memory

· 24

01.10.2020

Kernel virtual memory

Memory mapped region for

shared libraries

Run-time heap

(created by malloc)

User stack

(created at runtime)

0

Memory

invisible to

user code

Read/write data

Read-only code and data

Loaded from the

hello executable file

printf function

Program

start

.bss:

.data:

.rodata:

.text:

(uninitialised)

(initialised � read-write)

(initialised read-only)

defined, but not �initialized

you have � different sections �in your assembly program, �which corresponds to � different regions �in virtual memory.

4 sections

(program)

25 of 44

Object file

· 25

01.10.2020

-h (headers) reveals structure of object file�(the sections, their size, etc.).

26 of 44

Assembly file

· 26

01.10.2020

a function

a function

read-only data

program

read-only data

program

27 of 44

Object file

· 27

01.10.2020

disassemble all sections, �not just those containing instructions

reveals the actual �address of each instruction

last 2 phases of�assignment: need�to use this to find�patterns in the code

28 of 44

Outline

Brief review (x86-64 + ex 3.67)

Arrays, strings and structs in assembly

Structure of an .s program

Stack-based exploits and counter-measures

28

01.10.2020

29 of 44

Vulnerable C code

· 29

01.10.2020

example of vulnerable C code

disable stack protection�(more on that later today)�(so it’s not really a problem today; compilers prevent problem by default)

process tried accessing something outside itself.�what’s happening? � let’s investigate.

30 of 44

Vulnerable C code

· 30

01.10.2020

The stack is 16B�aligned

=>

rsp is decreased �with 16B�sub $0x10, %rsp

push old base pointer �on stack

new base pointer := �old stack pointer

new stack pointer := �old stack pointer - 16

enough space for buf

tip: for assignment, always use objdump. with it, you get�assembly, and the object, and the address where it is

= %rsp

address of buff into rax

then into 1st arg

initialize return

evict stack frame

31 of 44

Vulnerable code

0x7fffffffe490

0x7fffffffe40A

Stack Frame

for main

Contents of gets

Before call to gets

0x7fffffffe40A

0x40059c

0x7fffffffe480

gets buffer�8B aligned

return�instruction

so, why vulnerable?

%rbp+1 is return instruction

if I write more than 16 bytes,�then I overwrite return address.�I take over the machine then�(or provoke segmentation fault)

32 of 44

Buffer Overflow Stack Example

0x7fffffffe490

0x7fffffffe40A

Stack Frame

for main

Contents of gets

Before call to gets

0x7fffffffe40A

0x40059c

0x7fffffffe480

depending on input,�different termination behavior.

33 of 44

Malicious Use of Buffer Overflow

Input string contains byte representation of executable code

Overwrite return address with address of exploit code

When bar() executes ret, will jump to exploit code

int bar() {

char buf[64];

gets(buf);

...

return ...;

}

void foo(){

bar();

...

}

Stack after call to gets()

B

return

address

A

foo stack frame

bar stack frame

B

exploit

code

pad

data written

by gets()

payload:� exploit code (backwards) ++ pad ++ B

34 of 44

Exploits Based on Buffer Overflows

Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines

Internet worm

Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:

      • finger droh@cs.cmu.edu

Worm attacked fingerd server by sending phony argument:

      • finger “exploit-code padding new-return-address”
      • exploit code: executed a root shell on the �victim machine with a direct TCP �connection to the attacker.

Morris worm, 1988

how to stop this:

  • want to make it impossible to execute code on the stack.
  • want to make it hard to figure out where exploit code starts.
  • want to protect return address.

35 of 44

Avoiding Overflow Vulnerability

Use library routines that limit string lengths

fgets instead of gets

strncpy instead of strcpy

Don’t use scanf with %s conversion specification

      • Use fgets to read the string
      • Or use %ns where n is a suitable integer

/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� fgets(buf, 4, stdin);� puts(buf);�}

use a function that checks how much you read.

36 of 44

System-Level Protections

Randomized stack offsets

At start of program, allocate random amount of space on stack

Makes it difficult for hacker to predict beginning of inserted code

Nonexecutable code segments

In traditional x86, can mark region of memory as either “read-only” or “writeable”

      • Can execute anything readable

X86-64 added explicit “execute” permission

37 of 44

Stack Canaries

Idea

Place special value (“canary”) on stack just beyond buffer

Check for corruption before exiting function

GCC Implementation

-fstack-protector

-fstack-protector-all (default)

38 of 44

Setting Up Canary

echo:

. . .

mov %fs:20, %rax # Get canary

mov %rax, -8(%rbp) # Put on stack

xor %eax, %eax # Erase canary

. . .

/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� gets(buf);� puts(buf);�}

Return Address

%rbp

Stack Frame

for main

Content of gets

Before call to gets

Canary

Saved %rbp

random part of memory into rax

put canary just below base pointer

must overwrite canary to overwrite return address

39 of 44

Checking Canary

echo:

. . .

mov -8(%rbp), %rax # Retrieve from stack

xor %fs:20, %rax # Compare with Canary

je .L24 # Same: skip ahead

call __stack_chk_fail # ERROR

.L24:

. . .

/* Echo Line */�void echo()�{� char buf[4]; /* Way too small! */� gets(buf);� puts(buf);�}

Return Address

%rbp

Stack Frame

for main

Content of gets

Before call to gets

Canary

Saved %rbp

%rbp-0x8

%rbp-0x10

40 of 44

Canary �Example

41 of 44

Canary �Example

when compiled w/ protection

42 of 44

GCC protections

Compile-time

Run-time

pretty clear that you shouldn’t use gets.

43 of 44

Willard Rafnsson�IT University of Copenhagen

wilr@itu.dk�https://www.willardthor.com/

goal: tools that developers can use to write secure SW.

sample research (past supervisions):

  • analyze binaries for information leaks
  • reduce timing leaks in the Linux kernel
  • automatically fix vulnerabilities in JavaScript
  • automatically generate (i.e. synthesize) �a secure program from formal specification
  • assess privacy risk in analytics programs �(data scientists; Google search for “Privugger”)

I like code, and I like proofs.

I created the “Applied Information Security” course.

I’m a barista in Analog.

formal�methods

programming�languages

computer�security

my�research

shameless self-promotion

44 of 44

Take-Aways

Alignment is necessary when working with structs

Object files are sequences of hex codes, that can be mapped to assembly instructions

Buffer overflows cause vulnerabilities that can be exploited maliciously

Counter measures include (i) randomized stack addresses, (ii) non executable code on stack and (iii) canaries.

· 44

01.10.2020