1 of 41

CS61C: Great Ideas in Computer Architecture (aka Machine Structures)

Lecture 9: RISC-V Part 2: Data Transfer

Instructors: Lisa Yan, Justin Yokota


CS 61C

Spring 2024

2 of 41

Agenda

  • Main Memory
  • Data Transfer Instructions
  • C code examples


4 of 41

RV32 So Far…

  • Addition/subtraction

add rd, rs1, rs2

rd = rs1 + rs2

sub rd, rs1, rs2

rd = rs1 - rs2

Where rd, rs1, rs2 are registers (x0 - x31)

  • Add immediate

addi rd, rs1, imm

rd = rs1 + imm

  • (New) Load immediate

li rd, imm

rd = imm

Not a "real" instruction, because we can use existing instructions to do the same thing:

li rd, 5 -> addi rd, x0, 5

However, this is common enough that RISC-V includes this as a pseudoinstruction. Pseudoinstructions get replaced with their real instruction counterpart by the assembler.


5 of 41

Data Transfer: Load from and Store to memory

Registers: very fast, but limited space to hold values!

Memory: much larger place to hold values, but slower than registers!

6 of 41

Review: How memory works

  • On a 32-bit system, main memory contains 2^32 bytes. Every 32-bit number acts as the address of one byte.
  • 4 bytes together make a word. In order to store a word in memory, we cut the word up into 4 bytes, then store those bytes in consecutive addresses. In order to read a word, we read 4 consecutive addresses, then stitch those bytes back together.
  • RISC-V uses little-endian to store data: The least significant byte gets stored at the lowest address.
  • By convention we say that a word is stored at its lowest address
    • Ex. The word stored at address 0x1000 is composed of the bytes 0x1000, 0x1001, 0x1002, and 0x1003
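
The cut-up and stitch-together steps above can be sketched in C. This is only an illustration: the function names store_word_le and load_word_le are made up for this example, and mem stands in for four consecutive byte addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Cut a 32-bit word into 4 bytes and store them at consecutive addresses,
   least significant byte first (little-endian). Hypothetical helper. */
void store_word_le(uint8_t *mem, uint32_t word) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(word >> (8 * i));
    }
}

/* Read 4 consecutive bytes and stitch them back into one word. */
uint32_t load_word_le(const uint8_t *mem) {
    uint32_t word = 0;
    for (int i = 0; i < 4; i++) {
        word |= (uint32_t)mem[i] << (8 * i);
    }
    return word;
}

/* Round trip: 0xDEADBEEF is stored as the bytes EF BE AD DE. */
void check_round_trip(void) {
    uint8_t mem[4];
    store_word_le(mem, 0xDEADBEEFu);
    assert(mem[0] == 0xEF && mem[1] == 0xBE && mem[2] == 0xAD && mem[3] == 0xDE);
    assert(load_word_le(mem) == 0xDEADBEEFu);
}
```

Calling check_round_trip() from a main function exercises both directions.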


7 of 41

Agenda

  • Main Memory
  • Data Transfer Instructions
  • C code examples


8 of 41

Load Word

Load Word syntax:

  • lw rd imm(rs1)

Means: Compute imm+rs1, then load the 4 bytes at that address into rd

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      EF   BE   AD   DE   43   53   36   31   43   20   52   49   53   42   56   00

9 of 41

Load Word

Example: lw x10 12(x5) if x5 is 0x100

  • 0x100+12 = 0x10C
  • Bytes at 0x10C-0x10F are 0x53, 0x42, 0x56, 0x00
  • Since RISC-V is little-endian, this is the 32-bit value 0x0056 4253
  • So register x10 will now store 0x0056 4253
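
As a rough model of the example above, the following C sketch treats the 16 bytes from the memory table as an array and mimics lw. The names mem, MEM_BASE, and lw are illustrative assumptions, not part of any real simulator.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BASE 0x100u  /* assumed address of mem[0], matching the table */

/* The 16 bytes from the memory table, addresses 0x100 through 0x10F. */
const uint8_t mem[16] = {
    0xEF, 0xBE, 0xAD, 0xDE, 0x43, 0x53, 0x36, 0x31,
    0x43, 0x20, 0x52, 0x49, 0x53, 0x42, 0x56, 0x00
};

/* Model of "lw rd imm(rs1)": compute rs1 + imm, then assemble the 4 bytes
   at that address, lowest address as the least significant byte. */
uint32_t lw(uint32_t rs1, int32_t imm) {
    uint32_t addr = rs1 + (uint32_t)imm;
    /* This toy memory only covers 0x100 through 0x10F. */
    assert(addr >= MEM_BASE && addr + 3 <= MEM_BASE + 15);
    const uint8_t *p = &mem[addr - MEM_BASE];
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* lw x10 12(x5) with x5 = 0x100 reads bytes 53 42 56 00. */
void check_lw_example(void) {
    uint32_t x5 = 0x100;
    uint32_t x10 = lw(x5, 12);
    assert(x10 == 0x00564253u);
}
```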


10 of 41

Store Word

Store Word syntax:

  • sw rs2 imm(rs1)

Means: Compute imm+rs1, then store the 4 bytes of rs2 into that address.


11 of 41

Store Word

Example: sw x10 0(x5) if x5 is 0x100, x10 is 0x1234 5678

  • 0x100+0 = 0x100
  • Since RISC-V is little-endian, the 32-bit value 0x1234 5678 gets split into bytes 0x78 0x56 0x34 0x12
  • So the bytes in memory 0x100, 0x101, 0x102, and 0x103 get set to 0x78, 0x56, 0x34, and 0x12, respectively.
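
The store can be modeled the same way in C. This is a minimal sketch under the assumption that memory is a 16-byte array starting at address 0x100; mem, MEM_BASE, and sw are names invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BASE 0x100u  /* assumed address of mem[0] */

/* Memory contents before the store, addresses 0x100 through 0x10F. */
uint8_t mem[16] = {
    0xEF, 0xBE, 0xAD, 0xDE, 0x43, 0x53, 0x36, 0x31,
    0x43, 0x20, 0x52, 0x49, 0x53, 0x42, 0x56, 0x00
};

/* Model of "sw rs2 imm(rs1)": compute rs1 + imm, then write the 4 bytes
   of rs2 starting there, least significant byte first. */
void sw(uint32_t rs2, uint32_t rs1, int32_t imm) {
    uint32_t addr = rs1 + (uint32_t)imm;
    /* This toy memory only covers 0x100 through 0x10F. */
    assert(addr >= MEM_BASE && addr + 3 <= MEM_BASE + 15);
    for (int i = 0; i < 4; i++) {
        mem[addr - MEM_BASE + i] = (uint8_t)(rs2 >> (8 * i));
    }
}

/* sw x10 0(x5) with x5 = 0x100 and x10 = 0x12345678. */
void check_sw_example(void) {
    sw(0x12345678u, 0x100, 0);
    assert(mem[0] == 0x78 && mem[1] == 0x56 && mem[2] == 0x34 && mem[3] == 0x12);
    assert(mem[4] == 0x43);  /* neighboring bytes are untouched */
}
```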

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      78   56   34   12   43   53   36   31   43   20   52   49   53   42   56   00

12 of 41

Loading and Storing Bytes

In addition to word data transfers (lw, sw), RISC-V has byte data transfers:

  • lb rd imm(rs1)
  • sb rs2 imm(rs1)

Load and store one byte instead of a full word

Problem: If registers contain 4 bytes, how do we load/store only 1 byte?


13 of 41

Store Byte

Example: sb x10 0(x5) if x5 is 0x100, x10 is 0x1234 5678

For sb, we store the least significant byte.

In the above example, 0x78 is the LSB.

So the byte at address 0x100 gets set to 0x78. The bytes at 0x101 through 0x103 are left unchanged.

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      78   BE   AD   DE   43   53   36   31   43   20   52   49   53   42   56   00

14 of 41

Load Byte

Example: lb x10 0(x5) if x5 is 0x100

For lb, we extend the numeric value to a full 32 bits.

In the above example, we load the number 0xEF. What 32-bit number has the same numeric value as 0xEF?

Answer: It depends on your representation scheme

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      EF   BE   AD   DE   43   53   36   31   43   20   52   49   53   42   56   00

15 of 41

Sign- and Zero-extending

There are two main representation schemes used: unsigned numbers, and 2's complement

For unsigned numbers: 8-bit 0xEF = 239. 239 -> 32 bits is 0x000000EF. General rule: Fill the top bits with 0s (Zero-extension).

For signed numbers: 8-bit 0xEF = -17. -17 -> 32 bits is 0xFFFF FFEF.

General rule: Fill the top bits with the most significant bit of the number (Sign-extension).
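
The two rules can be written out in C as a small sketch. The helper names zero_extend8 and sign_extend8 are made up for this illustration; the sign extension is done with explicit bit operations so that it does not rely on how a particular compiler converts out-of-range values to signed types.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension: fill the top 24 bits with 0s. */
uint32_t zero_extend8(uint8_t b) {
    return (uint32_t)b;
}

/* Sign-extension: fill the top 24 bits with copies of bit 7. */
uint32_t sign_extend8(uint8_t b) {
    return (b & 0x80) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* 0xEF is 239 unsigned, and -17 in 8-bit 2's complement. */
void check_extension(void) {
    assert(zero_extend8(0xEF) == 0x000000EFu);  /* 239 */
    assert(sign_extend8(0xEF) == 0xFFFFFFEFu);  /* bit pattern of -17 */
    assert(sign_extend8(0x43) == 0x00000043u);  /* top bit 0: same as zero-extension */
}
```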


16 of 41

Loading and Storing Bytes

In addition to word data transfers (lw, sw), RISC-V has byte data transfers:

  • lb rd imm(rs1) -> sign extend the byte
  • lbu rd imm(rs1) -> zero extend the byte
  • sb rs2 imm(rs1) -> store least significant byte only

Load and store one byte instead of a full word


17 of 41

Example: What is in x12?

li x11,0x93F5

sw x11,0(x5)

lb x12,1(x5)


19 of 41

Example: What is in x12?

li x11,0x93F5

sw x11,0(x5)

lb x12,1(x5)

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      EF   BE   AD   DE   43   53   36   31   43   20   52   49   53   42   56   00

Register   Value
x11        0x00000000
x12        0x00000000
x5         0x00000100

Random garbage since data isn't set

20 of 41

Example: What is in x12?

li x11,0x93F5

sw x11,0(x5)

lb x12,1(x5)

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      EF   BE   AD   DE   43   53   36   31   43   20   52   49   53   42   56   00

Register   Value
x11        0x000093F5
x12        0x00000000
x5         0x00000100


22 of 41

Example: What is in x12?

li x11,0x93F5

sw x11,0(x5)

lb x12,1(x5)

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      F5   93   00   00   43   53   36   31   43   20   52   49   53   42   56   00

Register   Value
x11        0x000093F5
x12        0x00000000
x5         0x00000100

Little-endian order: F5 93 00 00 at addresses 0x100 through 0x103


24 of 41

Example: What is in x12?

li x11,0x93F5

sw x11,0(x5)

lb x12,1(x5)

Address (0x):  100  101  102  103  104  105  106  107  108  109  10A  10B  10C  10D  10E  10F
Byte (0x):      F5   93   00   00   43   53   36   31   43   20   52   49   53   42   56   00

Register   Value
x11        0x000093F5
x12        0xFFFFFF93
x5         0x00000100

Sign-extend: Top bit of 0x93 is 1, so fill with 1s
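
The whole three-instruction example can be traced in C. This is a sketch only: the function name example_x12 is invented, and a 4-byte array stands in for addresses 0x100 through 0x103.

```c
#include <assert.h>
#include <stdint.h>

/* Trace of: li x11,0x93F5 / sw x11,0(x5) / lb x12,1(x5), with x5 = 0x100.
   Returns the final value of x12. */
uint32_t example_x12(void) {
    /* Bytes at 0x100..0x103 before the store (leftover garbage). */
    uint8_t mem[4] = {0xEF, 0xBE, 0xAD, 0xDE};

    uint32_t x11 = 0x93F5;                    /* li x11,0x93F5 */

    for (int i = 0; i < 4; i++) {             /* sw x11,0(x5): little-endian */
        mem[i] = (uint8_t)(x11 >> (8 * i));   /* mem becomes F5 93 00 00 */
    }

    uint8_t byte = mem[1];                    /* lb x12,1(x5) reads 0x101 */
    assert(byte == 0x93);
    /* lb sign-extends: the top bit of 0x93 is 1, so fill with 1s. */
    uint32_t x12 = (byte & 0x80) ? (0xFFFFFF00u | byte) : (uint32_t)byte;
    return x12;
}
```

Swapping the last step for a plain zero-extension would model lbu instead, which would give 0x00000093.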


25 of 41

Agenda

  • Main Memory
  • Data Transfer Instructions
  • C code examples


26 of 41

Converting C code to RISC-V

So far, we've assumed that each variable gets stored in one register. What if we have more than 32 variables? Let's translate a program under the following restrictions:

  • Only registers x5, x6, and x7 may be modified, and only for intermediate calculations
    • We'll name them "t0", "t1", and "t2", for "temporary register 0-2"
  • x2 points to the start of a block of memory that we can use however we want
    • We'll name x2 "sp", for "stack pointer"


27 of 41

Converting C code to RISC-V

int a = 5;
char b[] = "string"; // Array will get stored on stack
int c[10];
uint8_t d = b[3];
c[4] = a+d;
c[a] = 20;


28 of 41

Converting C code to RISC-V

int a = 5;
char b[] = "string";
int c[10];
uint8_t d = b[3];
c[4] = a+d;
c[a] = 20;

Step 1: Assign each variable to some offset from sp.

  • Exact values don't matter as long as we're consistent

a: 0(sp)

b: 4(sp)

c: 12(sp)

d: 52(sp)
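
One way to see where these particular offsets could come from is to place the four variables in a C struct and ask the compiler for each field's offset. This is only an analogy: struct frame is a hypothetical name, a compiler is not required to lay out local variables this way, and the numbers assume 4-byte alignment for int32_t, which holds on mainstream ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the block of memory that sp points to. */
struct frame {
    int32_t a;      /* 4 bytes: offsets 0 through 3 */
    char    b[7];   /* "string" plus '\0': offsets 4 through 10 */
                    /* 1 byte of padding so that c is word-aligned */
    int32_t c[10];  /* 40 bytes: offsets 12 through 51 */
    uint8_t d;      /* 1 byte: offset 52 */
};

void check_frame_layout(void) {
    assert(offsetof(struct frame, a) == 0);
    assert(offsetof(struct frame, b) == 4);
    assert(offsetof(struct frame, c) == 12);
    assert(offsetof(struct frame, d) == 52);
}
```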


29 of 41

Converting C code to RISC-V

Translating: int a = 5; (a is at 0(sp))

li t0 5

sw t0 0(sp)


30 of 41

Converting C code to RISC-V

Translating: char b[] = "string"; (b starts at 4(sp); one sb per character, plus the null terminator)

li t0 0x73 # 's'
sb t0 4(sp)
li t0 0x74 # 't'
sb t0 5(sp)
li t0 0x72 # 'r'
sb t0 6(sp)
li t0 0x69 # 'i'
sb t0 7(sp)
li t0 0x6E # 'n'
sb t0 8(sp)
li t0 0x67 # 'g'
sb t0 9(sp)
sb x0 10(sp) # null terminator


31 of 41

Converting C code to RISC-V (Better Approach)

Translating: char b[] = "string"; (b starts at 4(sp); store four characters at a time, packed in little-endian order)

li t0 0x69727473

sw t0 4(sp)

li t0 0x0000676E

sw t0 8(sp)
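
The two constants above can be derived by packing the characters into words with the first character in the least significant byte. The C sketch below shows that derivation; pack_word_le is a name invented for this example, and the values assume ASCII character codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack up to 4 bytes starting at s into one word, with s[0] in the least
   significant byte (little-endian). Bytes beyond n are left as 0. */
uint32_t pack_word_le(const char *s, size_t n) {
    uint32_t word = 0;
    for (size_t i = 0; i < n && i < 4; i++) {
        word |= (uint32_t)(uint8_t)s[i] << (8 * i);
    }
    return word;
}

void check_string_words(void) {
    const char b[] = "string";                      /* 7 bytes including '\0' */
    assert(pack_word_le(b, 4) == 0x69727473u);      /* 's' 't' 'r' 'i' */
    assert(pack_word_le(b + 4, 3) == 0x0000676Eu);  /* 'n' 'g' '\0' */
}
```

The second sw also writes a zero into the padding byte at 11(sp), which is harmless because nothing else is stored there.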


32 of 41

Converting C code to RISC-V

Translating: int c[10];

Nothing: the declaration has no initializer, so no instructions are needed. The 40 bytes at 12(sp) through 51(sp) are simply reserved for c.


33 of 41

Converting C code to RISC-V

Translating: uint8_t d = b[3]; (b[3] is at offset 4 + 3 = 7; d is at 52(sp))

lb t0 7(sp)

sb t0 52(sp)


34 of 41

Converting C code to RISC-V

Translating: c[4] = a+d; (a is at 0(sp), d is at 52(sp), and c[4] is at offset 12 + 4*4 = 28)

lw t0 0(sp)

lbu t1 52(sp)

add t2 t0 t1

sw t2 28(sp)


35 of 41

Converting C code to RISC-V

Translating: c[a] = 20; (a is at 0(sp); c starts at 12(sp))

li t0 20

lw t1 0(sp)

What we would like to write, but it is not a valid instruction because the offset must be a constant:

sw t0 t1*4+12(sp)

Instead, compute the address t1*4 + 12 + sp explicitly:

slli t1 t1 2 #t1*=4

addi t1 t1 12

add t1 t1 sp

sw t0 0(t1)
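
The address arithmetic for c[a] can be mirrored line by line in C. This is a sketch: addr_of_c_elem is a made-up name, and the sp value 0x1000 used in the check is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the slli/addi/add sequence: byte address of c[a], given that
   c starts at 12(sp) and each int is 4 bytes. */
uint32_t addr_of_c_elem(uint32_t sp, uint32_t a) {
    uint32_t t1 = a;
    t1 = t1 << 2;   /* slli t1 t1 2   (t1 *= 4) */
    t1 = t1 + 12;   /* addi t1 t1 12  (offset of c within the block) */
    t1 = t1 + sp;   /* add t1 t1 sp   (absolute address) */
    return t1;
}

void check_c_elem_address(void) {
    uint32_t sp = 0x1000;                             /* arbitrary example value */
    assert(addr_of_c_elem(sp, 5) == sp + 32);         /* c[a] with a = 5 */
    assert(addr_of_c_elem(sp, 4) == sp + 28);         /* matches sw t2 28(sp) for c[4] */
}
```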


36 of 41

Converting C code to RISC-V

int a = 5;
char b[] = "string";
int c[10];
uint8_t d = b[3];
c[4] = a+d;
c[a] = 20;

li t0 5

sw t0 0(sp)

li t0 0x69727473

sw t0 4(sp)

li t0 0x0000676E

sw t0 8(sp)

lb t0 7(sp)

sb t0 52(sp)

lw t0 0(sp)

lbu t1 52(sp)

add t2 t0 t1

sw t2 28(sp)

li t0 20

lw t1 0(sp)

slli t1 t1 2 #t1*=4

addi t1 t1 12

add t1 t1 sp

sw t0 0(t1)


37 of 41

Why we need so many registers

  • As the previous example showed, it's possible to write RISC-V with only a sp and three temporary registers
  • Why do we have 32 registers?


38 of 41

RISC-V Guiding Philosophy

Extremely fast; extremely expensive; tiny capacity

Fast; priced reasonably; medium capacity

39 of 41

Speed of Registers vs Memory

  • Given that
    • Registers: 32 words (128 bytes)
    • Memory (DRAM): billions of bytes (2 GB to 96 GB on a laptop)
  • and physics dictates…
    • Smaller is faster
  • How much faster are registers than DRAM?
    • About 50-500 times faster! (in terms of latency of one access: tens of ns)
      • But subsequent words come every few ns


40 of 41

Jim Gray’s Storage Latency Analogy:

How Far Away is the Data?

Jim Gray (Turing Award; B.S. Cal 1966, Ph.D. Cal 1969)

Storage     Relative latency   Analogy      Travel time
Registers   1                  My Head      1 min
Memory      100                Sacramento   1.5 hr

41 of 41

And in Conclusion…

  • Memory is byte-addressable, but lw and sw access one word at a time.
  • A pointer (used by lw and sw) is just a memory address; we can add to it or subtract from it (using the offset).
  • Memory can be used for variables we can't store in registers, but it is about 100x slower than using registers directly
    • Use loads and stores as infrequently as possible!
  • New Instructions:

lw, sw, lb, sb, lbu
