2 of 58

How do our programs execute?

In COMP1[59]11:

We run a compiler (dcc?)
./hello
profit ??

What’s going on here? What’s even in hello?

3 of 58

Abiram’s show and tell: hard disk drives

Long-term, non-volatile storage

non-volatile means that data is preserved when power shuts off
relatively cheap (1TB ≈ 1000GB of storage for under $50)

This is where we typically save files! *

eg. photos, videos, documents, C code, League of Legends

Okay - so my compiler spat out hello and saved it to my hard drive - what next?

* hard disk drives are much less common these days - replaced by SSDs which are functionally equivalent

5 of 58

Abiram’s show and tell: RAM (or ‘memory’)

A program needs to be ‘in memory’ in order for it to run

‘memory’ typically refers to RAM
Communicating between the CPU and drives is too slow

RAM is just a massive 1D array which we divide into sections

An address is really just an ‘index’ into that array

RAM is volatile (flushed when it loses power)
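
To make the 'massive 1D array' picture concrete, here is a minimal C sketch. The names memory, store_byte and load_byte and the tiny MEMORY_SIZE are made up for illustration - real RAM is far bigger and is managed by the hardware and operating system.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE 64   // hypothetical size, chosen only to keep the sketch small

// "RAM" modelled as one flat array of bytes (starts out as all zeroes)
static uint8_t memory[MEMORY_SIZE];

// Store a byte at an address: the address is just an index into the array
void store_byte(uint32_t address, uint8_t value) {
    assert(address < MEMORY_SIZE);
    memory[address] = value;
}

// Load the byte stored at an address
uint8_t load_byte(uint32_t address) {
    assert(address < MEMORY_SIZE);
    return memory[address];
}
```

For example, after store_byte(12, 'h'), calling load_byte(12) gives back 'h'.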

6 of 58

Abiram’s show and tell: RAM (or ‘memory’)

hello contains information on how to set up memory

What instructions does the CPU need to follow?
What strings do we need loaded into memory?
Variables take up room!

Global variables are relevant in this course!
What about local variables?

Where do we put malloced memory?

7 of 58

Abiram’s show and tell: the CPU

We have instructions in RAM!
The CPU can fetch an instruction from memory

An instruction consists of an operator, and zero or more operands

8 of 58

But wait…

We’ve discussed memory and storage drives as being a place to store things.

But how is information actually stored?
Computers are really just massive circuits

Can think of electricity as being off or on
0 or 1 - this is a base-2 system

All data on a computer is represented as binary behind-the-scenes

This will become incredibly important in Week 5
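
As a rough sketch of what 'represented as binary' means, the C function below writes out the 8 bits of a single byte. byte_to_binary is a made-up name for illustration, not something from the course.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Writes the 8 bits of `value` into `out` as '0'/'1' characters,
// most significant bit first. `out` needs room for 8 characters plus '\0'.
void byte_to_binary(uint8_t value, char out[9]) {
    assert(out != NULL);
    for (int bit = 7; bit >= 0; bit--) {
        out[7 - bit] = ((value >> bit) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}
```

For example, the character 'A' has the value 65, which comes out as 01000001.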

9 of 58

Abiram’s show and tell: the CPU

We have instructions in RAM!
The CPU can fetch an instruction from memory
Circuitry within the CPU decodes the instruction to determine what to do
The CPU then executes that instruction, before moving on to fetch the next instruction!

10 of 58

Inside a CPU

11 of 58

What can instructions do?

Computations: eg. add, subtract, multiply, divide, bitwise (Week 5), …
Load/store: no point having memory if we can’t modify it or read it
Branch: jump to execute different instructions

Can’t have logic (eg. if statements) if our program continues linearly

System calls: call-a-friend for help - more on this soon

and more!

12 of 58

A day in the life of a CPU - as C code

int program_counter = START_ADDRESS;

while (1) {
    // Fetch an instruction from memory
    int instruction = memory[program_counter];

    // Move to the next instruction
    program_counter++;

    // Execute the instruction we just fetched
    // (note: some instructions may modify the program counter)
    execute(instruction, &program_counter);
}
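
The loop above leaves memory and execute undefined. Here is one way it could be filled in, as a sketch only: the three-instruction 'toy' instruction set below (OP_HALT, OP_ADD, OP_JUMP), the MAKE_INSTRUCTION macro and the single accumulator are all invented for illustration and are not MIPS.

```c
#include <assert.h>

// Hypothetical toy instruction set (NOT MIPS): each instruction is one int,
// with the operator in the high byte and a single operand in the low byte.
enum { OP_HALT = 0, OP_ADD = 1, OP_JUMP = 2 };

#define MAKE_INSTRUCTION(op, operand) (((op) << 8) | (operand))

// Runs the program stored in `memory` starting at address 0, and returns
// the accumulator once a halt instruction is reached.
int run(const int memory[]) {
    int program_counter = 0;
    int accumulator = 0;

    while (1) {
        // Fetch an instruction from memory, then move to the next one
        int instruction = memory[program_counter];
        program_counter++;

        // Decode: split the instruction into its operator and operand
        int op = instruction >> 8;
        int operand = instruction & 0xFF;

        // Execute
        if (op == OP_HALT) {
            return accumulator;
        } else if (op == OP_ADD) {
            accumulator += operand;
        } else {
            assert(op == OP_JUMP);
            program_counter = operand;   // a jump modifies the program counter
        }
    }
}
```

A program of add 5, jump to address 3, add 100, add 2, halt returns 7: the jump skips over the add 100.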

13 of 58

Writing instructions ourselves

In this course we will be writing CPU instructions ourselves instead of making a compiler do it.

Why might we do this?

Optimising code for performance

Fewer instructions = faster to execute = saving picoseconds!

Sometimes it’s necessary

eg. writing code to interact directly with a device (i.e. drivers)

Form a better understanding of how a compiled program executes

Primary reason in this course
Can be helpful when debugging
Also handy to identify security vulnerabilities and exploit binaries (see: COMP6447)

14 of 58

Assembly

Instructions are really just 0s and 1s

Would be a pain to read/write literal instructions
Instead, we use assembly language to form a human-readable representation of each instruction

Each instruction we write in assembly language typically represents a single CPU instruction
An assembler translates this to binary CPU instructions

15 of 58

A sample instruction + assembly

00100001000010010000000000001100

addi $t1, $t0, 12
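
The 32 bits above are not random: they split into a 6-bit opcode (001000, which is 8 for addi), a 5-bit source register (01000, register 8 = $t0), a 5-bit destination register (01001, register 9 = $t1) and a 16-bit immediate (12). A small C sketch of that packing, with encode_i_type being a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

// Packs the fields of a MIPS I-type instruction into 32 bits:
//   6-bit opcode | 5-bit rs (source) | 5-bit rt (destination) | 16-bit immediate
uint32_t encode_i_type(uint32_t opcode, uint32_t rs, uint32_t rt, uint16_t immediate) {
    assert(opcode < 64 && rs < 32 && rt < 32);
    return (opcode << 26) | (rs << 21) | (rt << 16) | immediate;
}
```

encode_i_type(8, 8, 9, 12) gives 0x2109000C, which is the same bit pattern as the sample instruction when written in hexadecimal.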

16 of 58

Assembly

Assembly language may also:

add some niceties such as constants
give us a way to configure other memory sections (for global variables, strings, etc.)
give us labels - a way to name points in memory without having to deal with addresses

17 of 58

Instruction sets

Different types of CPUs may speak different ‘languages’

That is, they understand a different set of instructions
Some instruction sets may be more complex than others
Influenced by different design choices

Some examples include x86, ARM, PowerPC, RISC V
In COMP1521, we learn the MIPS instruction set architecture

Relatively simple and well-known architecture

Once used everywhere from console to supercomputers
Still sometimes used in routers, TVs

Lots of learning resources available
Good stepping stone if you need to branch out to other ISAs

18 of 58

“But I don’t have a MIPS CPU!”

We can’t run our MIPS instructions on our x86-64/ARM CPUs.

Instead, we use an emulator called mipsy:

recreates the behaviour of a real MIPS CPU

written by Zac* (past course admin, now graduated and lecturing COMP6991)
can optionally download and run on your own machine: https://github.com/insou22/mipsy/
comes with a command-line interface to run in your terminal

mipsy_web builds on top of mipsy and runs entirely in your browser

written by Shrey* and linked on course website: https://cgi.cse.unsw.edu.au/~cs1521/mipsy

vscode extension

written by Xavier 🎉 - can download the ‘mipsy editor features’ extension

* some contributions from Josh Harcombe, Dylan Brotherston and me :)

19 of 58

When will he shut up and actually write a MIPS program?

20 of 58

soon™

two more things to cover.

21 of 58

Registers

memory is fast, but not fast enough
still physically separate from the rest of the CPU

The CPU has a small amount of storage on the chip itself:

cache: not covered in COMP1521, keeps copies of frequently accessed memory
registers:

32 general-purpose registers (32 bits each, same size as a typical C integer)
floating point registers used for non-integer arithmetic, not covered in COMP1521
Hi/Lo are special registers used for mult/div - not too important in this course
program counter keeps track of which instruction to fetch and execute next

modified by branch/jump instructions

22 of 58

Registers

Almost all of our computations happen between registers!

Want to multiply 2 and 3 and store the result?

Load 2 and 3 into registers:

li $t0, 2

li $t1, 3

Then multiply them and store the result:

mul $t2, $t0, $t1

23 of 58

Registers

Registers are denoted by a $ and can be referred to using a number ($0…$31) or by symbolic names ($zero…$ra)

$zero ($0) is special!

Always has the value 0 -> attempts to change it have no effect

$ra ($31) is also special!

Directly affected by two instructions we use in Week 3
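
One way to picture the registers, and why $zero is special, is as a small array where writes to index 0 are thrown away. This is only a sketch: read_register and write_register are invented names, and a real CPU does this in hardware.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGISTERS 32

// The 32 general-purpose registers, $0 to $31 (all start as 0 in this sketch)
static uint32_t registers[NUM_REGISTERS];

uint32_t read_register(int number) {
    assert(number >= 0 && number < NUM_REGISTERS);
    return registers[number];
}

void write_register(int number, uint32_t value) {
    assert(number >= 0 && number < NUM_REGISTERS);
    if (number == 0) {
        return;   // $zero: attempts to change it have no effect
    }
    registers[number] = value;
}
```

With $t0, $t1 and $t2 being registers 8, 9 and 10, the multiply example becomes write_register(8, 2), write_register(9, 3), then write_register(10, read_register(8) * read_register(9)).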

24 of 58

Registers

Technically, we could use the other 30 registers however we please, but there are some conventions we have to follow. These will be discussed in next week’s tutes and the Week 3 lectures.

25 of 58

Relevant registers (for now)

$t0 to $t9 are free real estate - can use however we want
Will also need $v0, $a0, $ra for certain things at the moment
Should not need to use any other registers (yet)

We will cover the other registers when we talk about functions in Week 3

26 of 58

System calls

Our programs are useless!

Let’s go back and look at the types of instructions mentioned earlier:

27 of 58

What can instructions do?

Computations: eg. add, subtract, multiply, divide, bitwise (Week 5), …
Load/store: no point having memory if we can’t modify it or read it
Branch: jump to execute different instructions

Can’t have logic (eg. if statements) if our program continues linearly

Move: copy values between registers
System calls: call-a-friend for help 👀

and more!

28 of 58

System calls

None of the instructions we have access to can interact with the outside world (eg. printing, scanning)
Instead, we request the operating system to perform these tasks for us - this process is called a system call

The operating system can access privileged instructions on the CPU (eg. communicating to other devices)
mipsy simulates a very basic operating system
Will explore real system calls and their raison d’etre in the second half of the course

29 of 58

Common mipsy syscalls

We won’t use syscalls 8, 12 much in COMP1521 - most input will be integers.

30 of 58

Other mipsy syscalls - seldom used

Probably not needed for COMP1521 - except maybe challenge exercises/provided code.

31 of 58

The system call workflow

We specify which system call we want in $v0

eg. print_int is syscall 1:

li $v0, 1

We specify arguments (if any)

li $a0, 42

We transfer execution to the operating system

syscall

The OS will fulfil our request if it looks sane

Some syscalls may return a value - check syscall table
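
From the operating system's side, the workflow could be pictured roughly like the C sketch below. Everything here is an assumption made for illustration: struct registers, handle_syscall and SYSCALL_PRINT_INT are invented names, only print_int is handled, and the output is written into a buffer rather than the terminal so it is easy to check. It is not how mipsy is actually implemented.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical register state: just the two registers the workflow uses
struct registers {
    int v0;   // which system call we want
    int a0;   // first argument
};

#define SYSCALL_PRINT_INT 1

// Plays the role of the operating system: looks at $v0 to decide what to do.
// Returns 0 on success, or -1 if the request does not look sane.
int handle_syscall(const struct registers *regs, char *out, size_t out_size) {
    assert(regs != NULL && out != NULL);

    if (regs->v0 == SYSCALL_PRINT_INT) {
        // print_int: the integer to print is in $a0
        snprintf(out, out_size, "%d", regs->a0);
        return 0;
    }

    return -1;   // unknown system call number
}
```

With v0 set to 1 and a0 set to 42, the buffer ends up holding "42".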

32 of 58

MIPS and mipsy documentation

Literally your best friend (it’ll even be there for you in the exam 🥺)

33 of 58

Lecture chat

Place to ask questions/make comments in the lecture (mostly) anonymously, if you like

Can deanonymise if the need arises - please follow UNSW Code of Conduct
Don’t spam
Supports Discord Markdown!

Mild shitposting is fine, in moderation
Don’t make me blacklist you >:(

34 of 58

Lecture chat

https://cgi.cse.unsw.edu.au/~abiramn/accord

35 of 58

Recap of lec01

Exploring different types of storage/memory
RAM contains everything a program needs in a given moment
Instructions!
Assembly language!
Registers!
System calls!

37 of 58

Finally, we can write hello world.

38 of 58

DISCLAIMER:

Code written in lectures may not necessarily have the best style!

Lecture code is meant to be quick and dirty, to demonstrate a concept
Will quickly overview good style soon, but refer to your tutor, tut solutions, lab solutions

39 of 58

li vs la vs move

li (load immediate) is for immediate, fixed values that you need to load into a register with an instruction
la (load address) is for loading fixed addresses into a register

remember, labels really just represent addresses!

move is for copying values between two registers
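
If it helps, here is a loose C analogy for the three, with the matching MIPS in comments. The global numbers and the function li_la_move_demo are made up for this sketch, and C variables are only standing in for registers.

```c
#include <assert.h>

// Hypothetical global: in MIPS this would sit in memory under the label `numbers`
static int numbers[4] = {10, 20, 30, 40};

int li_la_move_demo(void) {
    int t0 = 42;          // li   $t0, 42        a fixed value
    int *t1 = numbers;    // la   $t1, numbers   the address the label stands for
    int t2 = t0;          // move $t2, $t0       copy from one register to another

    // the "address" we loaded is simply where the array starts
    assert(t1 == &numbers[0]);

    return t2;
}
```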

40 of 58

Syntax overview

Assembly language programs contain:

Assembly instructions, each on their own line

These are generally a 1:1 mapping from assembly instructions to real CPU instructions
However, assemblers also provide pseudo-instructions for convenience
Some of these assembly pseudo-instructions turn into 2-3 real CPU instructions

li is an example - ask why on the forum if curious!

Labels … appended with :
Comments … starting with a #
Directives … symbol beginning with .
Constant definitions - like #defines in C:

MAX_NUMBERS = 256
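
On why li is a pseudo-instruction: a real MIPS instruction is only 32 bits wide, so it only has room for a 16-bit constant. A common way to load a full 32-bit value is to load the upper 16 bits first (lui) and then fill in the lower 16 bits (ori). The C sketch below mimics that split - the function names are invented, and the exact instructions mipsy picks for a given li may differ.

```c
#include <assert.h>
#include <stdint.h>

// Upper 16 bits: the part a `lui` (load upper immediate) would load
uint16_t upper_half(uint32_t value) {
    return value >> 16;
}

// Lower 16 bits: the part a following `ori` would fill in
uint16_t lower_half(uint32_t value) {
    return value & 0xFFFF;
}

// Mimics `lui` followed by `ori` to build a full 32-bit value
uint32_t load_immediate(uint32_t value) {
    uint32_t reg = (uint32_t)upper_half(value) << 16;   // lui: set the top half, bottom half is 0
    reg |= lower_half(value);                           // ori: OR in the bottom half
    assert(reg == value);
    return reg;
}
```

For 0x12345678 the two halves are 0x1234 and 0x5678.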

41 of 58

Style

We generally don’t indent to show structure

i.e. no indenting within conditionals, loops, etc.

Instead:

don’t indent labels
indent instructions by one step
have equivalent C code as inline comments

Huge recommendation: indent with 8-wide tabs

Ask on forum if anyone wants my vscode config

42 of 58

Simplified C

Translating C code directly to MIPS is not fun

Pro strat - simplify your C code and then translate it:

Map down to ‘simplified’ C

Simplified C is generally written so that each line of C code maps to one MIPS instruction
Compile your simplified C and make sure it still works as expected
Translate each line of simplified C to MIPS
Profit!!

43 of 58

MIPS Control

COMP1521 23T2: lec02

44 of 58

So far…

All of our programs so far have implemented fixed, predictable behaviour.

Execute linearly - that is, we always go down to the next instruction

However, what if we want to implement logic in our code?

If statements, where we may not always execute the same code, depending on a condition
For/while loops, where we may want to repeat the same instructions?

if/else and loops don’t exist in MIPS - we have to use branching to implement these ourselves

45 of 58

Branch/jump instructions

Allows you to transfer the flow of execution to a different instruction conditionally

except b, which is unconditional

Also j, jal, jalr, jr - unconditional jump instructions which we will talk about in MIPS Functions
Can replace with a constant in mipsy

46 of 58

In other words

A lot of these branch instructions are of the form:

“if condition is true, jump to instruction”

How do we implement this for our simplified C code?

47 of 58

COMP1511 staff hate this one simple trick!

In C, goto allows jumping to any arbitrary point within a program - as long as we define a label - meaning we can effectively yeet around within a program however we wish.

48 of 58

Simplifying if, if/else:

print_if_even, odd_even
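
The lecture examples themselves aren't reproduced here, so as a stand-in, this is roughly what an if/else looks like once simplified with goto. The function below is a guess at the flavour of odd_even, not the actual lecture code; it returns a string instead of printing so it is easy to check.

```c
#include <assert.h>
#include <stddef.h>

// Original C:
//     if (n % 2 != 0) { result = "odd"; } else { result = "even"; }
const char *odd_even(int n) {
    const char *result = NULL;

    if (n % 2 != 0) goto odd_even__odd;

    // the 'else' case falls through to here
    result = "even";
    goto odd_even__end;

odd_even__odd:
    result = "odd";

odd_even__end:
    assert(result != NULL);
    return result;
}
```

Each line is now either a simple statement or an 'if condition, goto label', which is the shape MIPS branch instructions have.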

49 of 58

goto is cool for simplification!

but don’t use it in your actual C programs.

goto makes programs more difficult to read
goto makes it hard for compilers to optimise code, resulting in slower programs
In general, do not use goto without good reason!

Typically only kernel/embedded programmers use goto

50 of 58

More complex conditionals: || and soft serve machines

if (milk_age > 48 || milk_level < 10) {
    printf("Replace milk\n");
} else {
    printf("Milk okay!\n");
}
printf("Done!\n");

Simplified with goto:

    if (milk_age > 48) goto milk_replace;
    if (milk_level < 10) goto milk_replace;
    printf("Milk okay!\n");
    goto milk_replace__end;

milk_replace:
    printf("Replace milk\n");

milk_replace__end:
    printf("Done!\n");

51 of 58

More complex conditionals: &&

if (x >= 0 && x <= 100) {
    // in bounds
} else {
    // out of bounds
}
return 0;

Invert the condition to use || (De Morgan’s Law):

if (x < 0 || x > 100) {
    // out of bounds
} else {
    // in bounds
}
return 0;
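
If you want to convince yourself the && version and the inverted || version really are the same, a quick check like the sketch below works. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

// The original condition, using &&
bool in_bounds_and(int x) {
    if (x >= 0 && x <= 100) {
        return true;    // in bounds
    } else {
        return false;   // out of bounds
    }
}

// The inverted condition, using || (with the two branches swapped)
bool in_bounds_or(int x) {
    if (x < 0 || x > 100) {
        return false;   // out of bounds
    } else {
        return true;    // in bounds
    }
}

// Returns true if both versions give the same answer for every x in [low, high]
bool versions_agree(int low, int high) {
    assert(low <= high);
    for (int x = low; x <= high; x++) {
        if (in_bounds_and(x) != in_bounds_or(x)) {
            return false;
        }
    }
    return true;
}
```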

52 of 58

More complex conditionals: &&

if (x < 0 || x > 100) {
    // out of bounds
} else {
    // in bounds
}
return 0;

Split into separate conditionals:

    if (x < 0) goto x_out_of_bounds;
    if (x > 100) goto x_out_of_bounds;
    // in bounds
    goto epilogue;

x_out_of_bounds:
    // out of bounds

epilogue:
    return 0;

53 of 58

Simplifying loop structures

for loops should be broken down to while loops
while loops should be broken down into if/goto

General structure:

loop init
loop condition (do we need to exit the loop?)
loop body
loop step
loop end

Use labels to show structure!

54 of 58

Counting to 10

for (int i = 0; i < 10; i++) {
    printf("%d\n", i);
}

As a while loop:

int i = 0;
while (i < 10) {
    printf("%d\n", i);
    i++;
}

55 of 58

Counting to 10

int i = 0;
while (i < 10) {
    printf("%d\n", i);
    i++;
}

Simplified with goto:

loop_i_to_10__init:;
    int i = 0;
loop_i_to_10__cond:
    if (i >= 10) goto loop_i_to_10__end;
loop_i_to_10__body:
    printf("%d", i);
    putchar('\n');
loop_i_to_10__step:
    i++;
    goto loop_i_to_10__cond;
loop_i_to_10__end:
    // ...

56 of 58

Simplifying for loops:

sum_100_squares
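
The lecture code for this isn't included here, so the sketch below is an assumption about what it does: add up the squares of 1 up to some limit, written in simplified C using the init/cond/body/step/end labels. sum_squares_up_to is an invented name, and it takes the limit as a parameter so it can be checked on small inputs too.

```c
#include <assert.h>

// Original C:
//     for (int i = 1; i <= n; i++) { sum += i * i; }
int sum_squares_up_to(int n) {
    assert(n >= 0);
    int sum = 0;

sum_loop__init:;
    int i = 1;
sum_loop__cond:
    if (i > n) goto sum_loop__end;
sum_loop__body:
    sum += i * i;
sum_loop__step:
    i++;
    goto sum_loop__cond;
sum_loop__end:
    return sum;
}
```

sum_squares_up_to(100) gives 338350.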

57 of 58

Sidenote: C break/continue

break can be used in a loop to completely exit the loop.

The loop condition here makes this look like an infinite loop:

while (1) {
    int c = getchar();
    if (c == EOF) break;
}

but break means it’s possible for the loop to be exited.

In simplified C/MIPS, a break is really just equivalent to going to the loop’s end label.

Avoid writing C code with break where possible.
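
A sketch of 'break is just a goto to the end label', using a string instead of getchar so it is easy to test (the '\0' at the end of the string plays the role of EOF). count_chars is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

// Original C:
//     while (1) {
//         char c = text[count];
//         if (c == '\0') break;
//         count++;
//     }
int count_chars(const char *text) {
    assert(text != NULL);
    int count = 0;

count_loop__body:;
    char c = text[count];
    if (c == '\0') goto count_loop__end;   // this is the break
    count++;
    goto count_loop__body;                 // while (1): always go around again

count_loop__end:
    return count;
}
```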

58 of 58

Sidenote: C break/continue

continue can be used to proceed to the next iteration of a for loop.

This would be a (terrible) way to print even numbers:

for (int i = 0; i < 10; i++) {
    if (i % 2 != 0) continue;
    printf("%d\n", i);
}

In simplified C/MIPS, a continue is really just equivalent to going to the loop’s step label.

Avoid writing C code with continue where possible.
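
A sketch of 'continue is just a goto to the step label'. To keep it easy to check, this made-up function adds up the even numbers rather than printing them; the loop shape is otherwise the same as the even-number printing loop.

```c
#include <assert.h>

// Original C:
//     for (int i = 0; i < n; i++) {
//         if (i % 2 != 0) continue;
//         sum += i;
//     }
int sum_evens_below(int n) {
    int sum = 0;

evens__init:;
    int i = 0;
evens__cond:
    if (i >= n) goto evens__end;
evens__body:
    if (i % 2 != 0) goto evens__step;   // this is the continue
    sum += i;
evens__step:
    i++;
    goto evens__cond;
evens__end:
    return sum;
}
```

A quick check: assert(sum_evens_below(10) == 20), since 0 + 2 + 4 + 6 + 8 = 20.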