1 of 58

COMP 520: Compilers!

Lecture 13: x86 Details!

2 of 58

Announcements + Logistics

  • Written Assignment due Sunday
    • I will post some links to x86 resources, but we will also basically give the answers away today (hopefully!), except for Q4
  • I hope you are working on PA3!! You can do this!

3 of 58

x86 Basics Continued!

5 of 58

From Yesterday: Register File

  • Lots of registers, but here are the 6 general purpose registers that we are concerned with:

rax, rcx, rdx, rbx, rsi, rdi

  • Question: what does that R prefix mean again?

6 of 58

REX Prefix

  • If we want to use 64-bit registers in assembly operations, we need to use something called the “REX” prefix.
  • x86_64 requires you to use “prefix” bytes to determine whether you’re using 32-bit registers or 64-bit registers and operands
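
To make the prefix byte concrete, here is a minimal sketch of how a REX byte can be assembled. The bit layout 0100WRXB is the standard one (W selects 64-bit operand size; R, X, and B each extend one register field); the function name `rex_prefix` is just an illustrative helper, not something from the course code.

```python
def rex_prefix(w=0, r=0, x=0, b=0):
    """Return the REX prefix byte, laid out in binary as 0100WRXB."""
    for bit in (w, r, x, b):
        if bit not in (0, 1):
            raise ValueError("each REX field is a single bit")
    # The high nibble 0100 marks the byte as a REX prefix.
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


# REX.W alone: 64-bit operand size, only the original eight registers.
print(hex(rex_prefix(w=1)))        # 0x48
# REX.W plus REX.B: 64-bit operands, and the r/m or base register is R8..R15.
print(hex(rex_prefix(w=1, b=1)))   # 0x49
```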

7 of 58

REX Prefix

  • Register Index from [0,7]

RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI

  • New registers use the same index range [0,7]; a bit in the REX prefix selects them

R8, R9, R10, R11, R12, R13, R14, R15
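
One way to picture the shared index range: number all sixteen registers 0 to 15, keep the low three bits as the index, and treat the fourth bit as the extension bit that goes into the REX prefix. A small sketch, where `split_register` is a hypothetical helper name:

```python
# Hardware numbering of the sixteen 64-bit general purpose registers.
REGISTERS = [
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
]


def split_register(name):
    """Return (extension_bit, index), where index is in [0,7]."""
    number = REGISTERS.index(name.lower())
    return number >> 3, number & 0b111


print(split_register("rax"))   # (0, 0)
print(split_register("r8"))    # (1, 0)
print(split_register("rdi"))   # (0, 7)
print(split_register("r15"))   # (1, 7)
```

So RAX and R8 share index 0, and only the extension bit tells them apart.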

8 of 58

REX Prefix: [regB+regX*8+520],regW

9 of 58

Thus, prefix bytes control registers

  • The prefix takes part in choosing which register gets used.
  • If no prefix is specified, x86_64 will assume 32-bit for the most part
    • This can result in incorrect computation!
    • We will see examples of this next week.
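
As a rough preview of how a missing prefix can go wrong, the sketch below models only the arithmetic: the same addition done at 32-bit width and at 64-bit width. It is a simulation in Python, not the examples from next week; the names `add32` and `add64` are made up for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def add32(a, b):
    """Add as a 32-bit operation would: the result wraps at 2**32."""
    return (a + b) & MASK32


def add64(a, b):
    """Add as a 64-bit operation would: the result wraps at 2**64."""
    return (a + b) & MASK64


big = 0xFFFFFFFF                  # largest value that fits in 32 bits
print(hex(add32(big, 1)))         # 0x0
print(hex(add64(big, 1)))         # 0x100000000
```

On real hardware, a 32-bit result written to a register also zeroes the upper half of the 64-bit register, so the carry is simply lost.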

10 of 58

Jump/Branch

jmp rax

jmp 4000 0096

Unconditional Jump (not very interesting)

Jump to some other location in executable code

13 of 58

CFLAGs (condition flags, held in the RFLAGS register)

CF, PF, AF, ZF, SF, OF

Idea: When comparing two parameters, generate everything! Are they equal? Is a<b? Etc.
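
The "generate everything" idea can be sketched as a small simulation: compute a - b once, derive the flags from it, and let each conditional jump read whichever flags it cares about. This covers ZF, SF, CF, and OF only (PF and AF are left out), and the helper names `cmp_flags` and `jump_taken` are illustrative assumptions.

```python
BITS = 64
MASK = (1 << BITS) - 1


def sign(value):
    """Top bit of a 64-bit value."""
    return (value >> (BITS - 1)) & 1


def cmp_flags(a, b):
    """Flags that `cmp a, b` would set, computed from a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "ZF": int(result == 0),                  # equal
        "SF": sign(result),                      # result looks negative
        "CF": int(a < b),                        # unsigned borrow
        "OF": int(sign(a) != sign(b) and sign(result) != sign(a)),
    }


def jump_taken(jump, flags):
    """Would this signed conditional jump be taken, given the flags?"""
    zf, sf, of = flags["ZF"], flags["SF"], flags["OF"]
    conditions = {
        "je": zf == 1,
        "jne": zf == 0,
        "jl": sf != of,
        "jge": sf == of,
        "jle": zf == 1 or sf != of,
        "jg": zf == 0 and sf == of,
    }
    return conditions[jump]


print(jump_taken("je", cmp_flags(3, 3)))    # True
print(jump_taken("jl", cmp_flags(-1, 1)))   # True
```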

15 of 58

Comparison of code

Code that you see

if( rax == 3 )
    rcx = 4;
else
    rcx = 5;
print( rcx )

Code that CPU sees

cmp rax,3
je IsEqual
mov rcx,5
jmp End
IsEqual: mov rcx,4
End: push rcx
call print

17 of 58

Stack pointer(s)

rsp, rbp

rsp= stack pointer

rbp= stack base pointer

Used for: (1) parameters in a function call, (2) temporary variables, (3) stack framing for more temp variables, and more!

18 of 58

Stack Pointer

  • When a function call is made, the return address is pushed onto the stack, and RSP is decremented.
  • When a function returns, the return address is popped from the stack, and RSP is incremented.
  • Push and pop manipulate RSP in the same way
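
A toy model of these four operations, assuming 8-byte stack entries and an arbitrary starting address. The class and the example return address are made up for illustration; a real `call` and `ret` also change RIP, which this sketch does not model.

```python
class Stack:
    """Toy model of the x86_64 stack: a dict from address to 8-byte value."""

    def __init__(self, rsp=0x7FFF0000):
        self.rsp = rsp
        self.memory = {}

    def push(self, value):
        self.rsp -= 8                  # the stack grows toward lower addresses
        self.memory[self.rsp] = value

    def pop(self):
        value = self.memory[self.rsp]  # read the value at the top of the stack
        self.rsp += 8
        return value

    def call(self, return_address):
        self.push(return_address)      # call pushes the return address

    def ret(self):
        return self.pop()              # ret pops it; the CPU would jump there


s = Stack(rsp=0x1000)
s.push(4)
s.call(0x401000)        # hypothetical return address
print(hex(s.rsp))       # 0xff0
print(hex(s.ret()))     # 0x401000
print(s.pop())          # 4
print(hex(s.rsp))       # 0x1000
```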

19 of 58

Base Stack Pointer

  • AKA the frame pointer.

  • At the beginning of a function, the current RBP value is pushed onto the stack, and then RBP is set to the current RSP value. This creates a new stack frame. At the end of the function, the original RBP value is restored.
  • RBP remains constant throughout the function's execution, unlike RSP, which changes with each stack operation.
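
The frame setup and teardown can be traced with a small simulation. The instruction pairs in the comments (`push rbp` / `mov rbp, rsp`, then `mov rsp, rbp` / `pop rbp`) are the usual way to do this and are an assumption here, as are the function name and the example addresses.

```python
def run_function(rsp, rbp, local_values):
    """Simulate prologue, pushing locals, and epilogue; return final state."""
    memory = {}

    # Prologue: push rbp / mov rbp, rsp
    rsp -= 8
    memory[rsp] = rbp
    rbp = rsp

    # Body: each pushed local moves RSP, while RBP stays put.
    for value in local_values:
        rsp -= 8
        memory[rsp] = value
    first_local = memory.get(rbp - 8)   # the first local sits at [rbp-8]

    # Epilogue: mov rsp, rbp / pop rbp
    rsp = rbp
    rbp = memory[rsp]
    rsp += 8
    return rsp, rbp, first_local


rsp, rbp, first = run_function(0x1000, 0x2000, [4, 65535])
print(hex(rsp), hex(rbp), first)   # 0x1000 0x2000 4
```

Both registers come back to their starting values, which is what lets the caller keep using its own frame.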

20 of 58

Stack Growth

Push/Pop data on/off the stack.

Each “entry” is 8 bytes in this example.

For 32-bit, it would be 4 bytes.

[Diagram: THE STACK, holding the entries 0, 0, 0, 4, 65535, with RSP and RBP marked]

21 of 58

Stack Growth

Shown in the stack on the right.

Unintuitively, higher positions are at lower memory addresses.

E.g. the number “4” is at rbp-8, not rbp+8, or rsp+8, not rsp-8

[Diagram: the stack holding the entries 0, 0, 0, 4, 65535, with RSP and RBP marked and the top labeled LOWER ADDRESSES]

23 of 58

Local Variables

  • First covered usage of the stack: local variables!

24 of 58

Local Variables

  • Local variables in a method are not static variables, so they live in neither .bss nor .data

  • Two options: use the heap (shown later) or use the stack (recommended)

  • Question: Why not give every local a location in .bss?

25 of 58

Local Variables

Consider:

push 4

[Diagram: stack entries, top to bottom: 4, Some other var; markers: RIP, RSP (shown twice), RBP]

26 of 58

Local Variables

Consider:

mov qword [rbp-8],5

[Diagram: stack entries, top to bottom: 4 -> 5, Some other var; markers: RIP, RSP, RBP]

28 of 58

Next use of the stack

  • How can we pass parameters to a method?

a = 3; b = 5;

someMethod( a, b );

29 of 58

Stack pointer(s)

rsp, rbp

someMethod( int a, int b );

push qword [b]
push qword [a]

call someMethod

30 of 58

Stack pointer(s) – How to call a method

rsp, rbp

someMethod( a, b );

push qword [b]
push qword [a]
call someMethod

[Diagram: stack entries, top to bottom: [b] = 5, Some other var; markers: RIP, RBP, RSP (shown twice)]

31 of 58

Stack pointer(s) – How to call a method

rsp, rbp

someMethod( a, b );

push qword [b]
push qword [a]
call someMethod

[Diagram: stack entries, top to bottom: [a] = 3, [b] = 5, Some other var; markers: RIP, RBP, RSP (shown twice)]

32 of 58

Stack pointer(s) – How to call a method

rsp, rbp

someMethod( a, b );

push qword [b]
push qword [a]
call someMethod

[Diagram: stack entries, top to bottom: Return Address, [a] = 3, [b] = 5, Some other var; markers: RIP, RBP, RSP (shown twice)]

33 of 58

Stack pointer(s) – How to call a method

rsp, rbp

At the end of someMethod:

ret

“Take address at top of stack”
“Set RIP to be that address”
“Pop the top of the stack”

[Diagram: stack entries, top to bottom: Return Address, [a] = 3, [b] = 5, Some other var; markers: RBP, RSP, RIP]

34 of 58

Stack pointer(s) – How to call a method

rsp, rbp

someMethod( a, b );

push qword [b]
push qword [a]
call someMethod
add rsp,16

[Diagram: stack entries, top to bottom: Return Address, [a] = 3, [b] = 5, Some other var; markers: RSP,RBP (together), RSP, RIP]
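
To check that the whole sequence leaves the stack balanced, here is a small trace of push b, push a, call, ret, and the final add rsp,16. It assumes 8-byte entries; the function name and the return address 0x401000 are made up for illustration.

```python
def call_some_method(rsp, a, b, return_address=0x401000):
    """Model: push b / push a / call someMethod / ret / add rsp,16."""
    memory = {}
    start = rsp

    rsp -= 8
    memory[rsp] = b                  # push b
    rsp -= 8
    memory[rsp] = a                  # push a
    rsp -= 8
    memory[rsp] = return_address     # call pushes the return address

    # What the callee finds on entry, starting at [rsp]:
    # return address, then a, then b.
    on_entry = [memory[rsp], memory[rsp + 8], memory[rsp + 16]]

    rip = memory[rsp]                # ret: take the address at the top...
    rsp += 8                         # ...and pop it
    rsp += 16                        # add rsp,16: drop the two arguments

    return on_entry, rip, rsp == start


on_entry, rip, balanced = call_some_method(0x1000, 3, 5)
print([hex(v) for v in on_entry])   # ['0x401000', '0x3', '0x5']
print(hex(rip), balanced)           # 0x401000 True
```

Pushing b first means a ends up closest to the return address, and 16 is just two 8-byte arguments.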

35 of 58

What does a CALLED method look like?

  • Stack framing will be covered in an upcoming lecture

  • Idea: “keep our local variables in the stack, then when we call a method, create a new ‘frame’ where that method can store ITS OWN local variables”

36 of 58

Lastly, pop

  • Very straightforward
  • Reads the value at the top of the stack (at [rsp]) and stores it in some destination register

pop rcx

  1. Copies the value at [rsp] into rcx
  2. Adds 8 to rsp

37 of 58

Worksheet Question Q3

How to set up a stack frame…

We know that we need:

  1. The push operation
  2. Initially rbp and rsp are the same, before we push other variables on

38 of 58

Worksheet Question Q3

Teardown is just undoing our work…

39 of 58

Stack Space in x86

  • Push and pop operate on 64 bits at a time (a 32-bit push cannot be encoded in 64-bit mode)
  • This is because we are in 64-bit mode (long mode)
  • Thus, every value we push onto the stack takes up 8 bytes

40 of 58

Back to Memory Organization

41 of 58

.text Segment

  • As specified earlier, this is where code is stored

  • Question: can code exist in other segments? What about non-executable segments?

42 of 58

Static vs Dynamic memory

  • But where is THIS data stored?

int[] p = new int[ someVariableSize ];

Doesn’t fit our notions of .bss nor .data! (Why?)

If we use the stack, then we consume a ton of stack space.

43 of 58

Dynamic memory is in the heap

int[] p = new int[ someVariableSize ];

The heap is just a memory location.

Simplest heap possible:

44 of 58

Super-simple heap

Heap Ptr

Heap Base: 0x8000 0000

Heap End: 0x8FFF FFFF

46 of 58

Super-simple heap

Heap Ptr

I want 3072 bytes of data!

malloc(3072)

new char[3072]

Returned start of allocated memory

Heap Base: 0x8000 0000

Heap End: 0x8FFF FFFF
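
The super-simple heap boils down to one pointer that only moves forward. A minimal sketch, using the base and end addresses from the slide; the class name `BumpHeap` is made up, there is no free(), and treating Heap End as the last usable byte is an assumption.

```python
class BumpHeap:
    """Simplest possible heap: hand out memory by advancing one pointer."""

    def __init__(self, base=0x80000000, end=0x8FFFFFFF):
        self.base = base
        self.end = end       # assumed to be the last usable byte
        self.ptr = base      # the "Heap Ptr": next free address

    def malloc(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        if self.ptr + size - 1 > self.end:
            raise MemoryError("heap exhausted")
        start = self.ptr     # returned start of allocated memory
        self.ptr += size     # move the heap pointer past the new block
        return start


heap = BumpHeap()
print(hex(heap.malloc(3072)))   # 0x80000000
print(hex(heap.ptr))            # 0x80000c00
print(hex(heap.malloc(8)))      # 0x80000c00
```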

49 of 58

How is something returned?

  • Assume [rbp+0E8h] is the variable a
  • Where is the return value from the “new” operation?

FASTCALL, but assume push 8

50 of 58

Assumptions

  • We will assume that all functions/methods return their value in rax
  • So rax will be “reserved” when doing a call, but general purpose otherwise

53 of 58

What does this look like?

Consider:

push 8
call malloc
mov [a], rax
mov dword [rax+0],3
mov dword [rax+4],5

54 of 58

Field variables are offsets

x: From some base address, add +0

y: From some base address, add +4
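
To see the offsets at work, the sketch below treats an 8-byte buffer as the object returned by malloc and writes x and y into it as 4-byte little-endian ints. The buffer stands in for real heap memory, and the constant names are illustrative.

```python
import struct

X_OFFSET = 0   # x: base address + 0
Y_OFFSET = 4   # y: base address + 4

obj = bytearray(8)                        # stand-in for the 8 bytes from malloc
struct.pack_into("<i", obj, X_OFFSET, 3)  # like: mov [rax+0],3
struct.pack_into("<i", obj, Y_OFFSET, 5)  # like: mov [rax+4],5

x = struct.unpack_from("<i", obj, X_OFFSET)[0]
y = struct.unpack_from("<i", obj, Y_OFFSET)[0]
print(x, y, obj.hex())   # 3 5 0300000005000000
```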

56 of 58

So how did we figure out the alloc size of “A”?

  • A has two fields, both of them int. If int means 4 bytes (we will assume 8 bytes in miniJava), then the alloc size of A will be 8 bytes total.
  • Thus: push 8, then call malloc
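
The size calculation can be sketched as a running offset over the fields. This assumes every field has the same size and ignores any padding or object header; `layout` is a hypothetical helper, and the field names x and y are the ones used for A earlier.

```python
def layout(field_names, field_size):
    """Assign consecutive offsets; return (offsets, total allocation size)."""
    offsets = {}
    next_offset = 0
    for name in field_names:
        offsets[name] = next_offset
        next_offset += field_size
    return offsets, next_offset


print(layout(["x", "y"], 4))   # ({'x': 0, 'y': 4}, 8)
print(layout(["x", "y"], 8))   # ({'x': 0, 'y': 8}, 16)
```

With 4-byte ints this gives the 8 bytes passed to malloc above; with the 8-byte ints assumed for miniJava, the same class would need 16.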

57 of 58

Coming up next!

  • How can we handle a.b.c.d.x?

  • What about classes like this:
  • What is their size?
  • (How many bytes are allocated when creating a new A()?)

58 of 58

Review this content!

  • Basic assembly operations: mov from memory, add, subtract, store to memory (mov), call, push, pop, jmp, cmp, conditional jumps (jle,jl,jge,jg,je,jne)

  • Know about stack, heap, .bss, .data, .text
