Introduction to RISC-V

Resource#

Assembly language has no concept of variables; instructions operate on registers instead. The operands of arithmetic instructions must come from registers, which are special storage locations built directly into the CPU hardware.

Registers are small storage locations inside the central processing unit (CPU) that temporarily hold instructions, data, and addresses. Their capacity is limited, but they can be read and written very quickly. Registers hold the intermediate results of a computation, so a program that keeps frequently used data in registers runs faster than one that fetches it from memory each time.

RISC-V Card#

RISC-V Operands#

A 64-bit register holds a double word; a 32-bit register holds a word.
x0 is hard-wired to 0.
- add x3, x4, x0 => x3 = x4 (adding x0 copies x4 unchanged)

Assembly Instructions#

Storing Operands#

The instruction that copies data from memory to a register is called a load instruction (load). In RISC-V, the instruction is ld, which means load double word.

Compiling a C statement that reads a value from an array#

g = h + A[8];

A is an array of 100 double words; g and h are stored in x20 and x21, respectively, and the base address of the array is in x22.

ld x9, 8(x22)     // x9 = A[8] (8 is the element index here; the byte offset is corrected below)
add x20, x21, x9  // g = h + A[8]

The register storing the base address (x22) is called the base register, and the 8 in the data transfer instruction is called the offset.

{{< block type="tip" title="Big-endian and Little-endian addressing">}}
Computers fall into two camps: big-endian machines use the address of the leftmost (most significant, or "big-end") byte as the double word address, while little-endian machines use the address of the rightmost (least significant, or "little-end") byte.

RISC-V is little-endian. Byte order only matters when the same data is accessed both as a double word and as eight individual bytes, so in most cases there is no need to worry about "endianness".
{{< /block >}}
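As a rough illustration of little-endian layout, the sketch below writes a double word into eight bytes with the least significant byte at the lowest address, then reads it back. The names store_le64 and load_le64 are hypothetical helpers invented for this example; they are not part of any RISC-V toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write a 64-bit value into out[0..7] in
   little-endian order, so the least significant byte lands at the
   lowest address. */
void store_le64(uint64_t value, uint8_t out[8]) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Hypothetical helper: read the same 8 bytes back as one double word. */
uint64_t load_le64(const uint8_t in[8]) {
    assert(in != NULL);
    uint64_t value = 0;
    for (int i = 0; i < 8; i++) {
        value |= (uint64_t)in[i] << (8 * i);
    }
    return value;
}
```

Because the helpers use shifts rather than pointer casts, the result does not depend on the byte order of the machine running the C code.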

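RISC-V memory is byte-addressed, and each double word occupies 8 bytes. To obtain the correct byte address in the code above, the offset added to the x22 register must therefore be 64 (8 × 8), so the load becomes ld x9, 64(x22).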
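The offset arithmetic can be sketched in C as follows. The function name byte_offset is made up for this example, and the element size of 8 assumes double word elements as in the array A above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of element `index` in an array whose
   elements are `elem_size` bytes wide (8 for a RISC-V double word). */
size_t byte_offset(size_t index, size_t elem_size) {
    assert(elem_size > 0);
    return index * elem_size;
}
```

Calling byte_offset(8, 8) gives the 64 used in the corrected load, and the same rule applies to any other index into A.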

The complement of the load instruction is the store instruction (store), which copies data from a register to memory. In RISC-V, the instruction is sd, which means store double word.

{{< block type="tip">}}
In some architectures, the starting address of a word must be a multiple of 4, and the starting address of a double word must be a multiple of 8. This requirement is called an alignment restriction.
{{< /block >}}

RISC-V and Intel x86 do not have alignment constraints, but MIPS does.
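For architectures that do enforce alignment, the check amounts to testing whether the address is a multiple of the access size. A minimal sketch, assuming the size is a power of two (4 for a word, 8 for a double word); is_aligned is a hypothetical name used only here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when `address` is a multiple of `size`.
   `size` is assumed to be a nonzero power of two, so the low bits of
   an aligned address are all zero. */
bool is_aligned(uint64_t address, uint64_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    return (address & (size - 1)) == 0;
}
```

Under this rule, address 12 is a valid start for a word but not for a double word.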

Compiling code that uses load and store#

A[12] = h + A[8];

h is stored in x21, and the base address of A is stored in x22.

ld x9, 64(x22)  // x9 = A[8]
add x9, x21, x9 // x9 = h + A[8]
sd x9, 96(x22)  // A[12] = x9

Compile a string copy program into assembly#

void strcpy(char x[],char y[]){
	size_t i;
	i = 0;
	while((x[i] = y[i]) != '\0'){
		i += 1;
	}
}

The base addresses of x and y are stored in x10 and x11, respectively, and i is stored in x19.

strcpy:
	addi sp, sp, -8  // Adjust stack pointer to store one item (x19)
	sd x19, 0(sp)    // Push x19 onto the stack
	add x19, x0, x0  // x19 = 0 + 0
L1: add x5, x19, x11 // x5 = x19 + x11 => address of y[i] in x5
	lbu x6, 0(x5)    // temp: x6 = y[i]
	add x7, x19, x10 // x7 = x19 + x10 => address of x[i] in x7
	sb  x6, 0(x7)    // x[i] = y[i] (store a single byte)
	beq x6, x0, L2   // if x6 == 0 then go to L2
	addi x19, x19, 1 // i = i + 1
	jal x0, L1       // go to L1
L2: ld x19, 0(sp)    // Restore x19 and stack pointer
	addi sp, sp, 8 
	jalr x0, 0(x1)

A loop code compiled into assembly#

int A[20];
int sum = 0;
for (int i = 0; i < 20; i++){
	sum += A[i];
}

RISC-V assembly (32 bit)

	add x9, x8, x0     # x9 = &A[0] (base address of A is in x8)
	add x10, x0, x0    # sum = 0
	add x11, x0, x0    # i = 0
	addi x13,x0, 20    # x13 = 20 (loop bound)
Loop:
	bge x11, x13, Done # if x11 >= x13 go to Done (end loop)
	lw x12, 0(x9)      # x12 = A[i]
	add x10, x10, x12  # sum
	addi x9, x9, 4     # x9 = &A[i+1]
	addi x11, x11, 1   # i++
	j Loop
Done:

Logical Operations#

and, andi: bitwise AND
- and x5, x6, x9 => x5 = x6 & x9
- andi x5, x6, 3 => x5 = x6 & 3
sll, slli: logical left shift (each position shifted multiplies by 2)
- slli x11, x23, 2 => x11 = x23 << 2
- before: 0000 0010 => 2
- after:  0000 1000 => 8
srl, srli: logical right shift (each position shifted divides an unsigned value by 2)
- srli x23, x11, 2 => x23 = x11 >> 2
- before: 0000 1000 => 8
- after:  0000 0010 => 2
sra, srai: arithmetic right shift (the sign bit is copied into the vacated bits)
- before: 1111 1111 1111 1111 1111 1111 1110 0111 = -25
- srai x10, x10, 4
- after:  1111 1111 1111 1111 1111 1111 1111 1110 = -2
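The three shift flavors can be modeled in C on 32-bit values. This is only a sketch: sll32, srl32, and sra32 are names invented here. Right-shifting a negative signed integer is implementation-defined in C, so sra32 shifts the bitwise complement to get sign-filling behavior portably.

```c
#include <assert.h>
#include <stdint.h>

/* Logical left shift, like sll/slli on a 32-bit register. */
uint32_t sll32(uint32_t x, unsigned n) {
    assert(n < 32);
    return x << n;
}

/* Logical right shift, like srl/srli: zeros fill the vacated bits. */
uint32_t srl32(uint32_t x, unsigned n) {
    assert(n < 32);
    return x >> n;
}

/* Arithmetic right shift, like sra/srai: the sign bit fills the
   vacated bits. For negative x the complement is non-negative, so it
   can be shifted safely and complemented back. */
int32_t sra32(int32_t x, unsigned n) {
    assert(n < 32);
    return x < 0 ? ~(~x >> n) : x >> n;
}
```

With these helpers, sra32(-25, 4) yields -2, matching the srai example, while srl32 on the same bit pattern would produce a large positive number instead.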

Helpful RISC-V Assembler Features#

a0 - a7 (x10 - x17): 8 argument registers used to pass parameters in function calls; a0 - a1 also carry the return values.
zero represents x0.
mv rd, rs = addi rd, rs, 0.
li rd, 13 = addi rd, x0, 13.
nop = addi x0, x0, 0.
la a1, Label loads the address of Label into a1.
ra (x1): the return address register, used to return to the call site.
s0 - s1 (x8 - x9) and s2 - s11 (x18 - x27): saved registers.

RISC-V Function Call Conventions#

Registers are faster than memory, so use them.
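jal rd, Label: jump and link. Stores the address of the next instruction (PC + 4) in rd, then jumps to Label.
1. jal x1, 100.
jalr rd, rs, imm: jump and link register. Stores PC + 4 in rd, then jumps to the address rs + imm.
1. jalr x1, 100(x5).
jal Label => jal ra, Label, which calls a function.
jalr s1, when s1 holds a function pointer, is also a function call.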

A function call converted to assembly#

...
sum(a,b);
...

int sum(int x, int y){
	return x + y;
}

1000 mv a0, s0              # x = a
1004 mv a1, s1              # y = b
1008 addi ra, zero, 1016    # ra = 1016, the return address (the instruction after the jump)
1012 j sum                  # jump to sum
1016 ... 
...
2000 sum: add a0, a0, a1
2004 jr ra

The two instructions at 1008 and 1012 can be replaced with a single jal sum.

Basic Steps for Calling a Function#

Place the required parameters where the function can access them (the argument registers).
Transfer control to the function (jal).
1. Save the return address and jump to the function's address.
Acquire the (local) storage resources the function needs.
Perform the function's task.
Place the return value where the calling code can access it, restore any registers used, and free the local storage.
Return control to the caller (ret), using the address saved in ra to resume at the call site.

Function Call Example#

int leaf(int g, int h, int i, int j){
	int f;
	f = (g + h) - (i + j);
	return f;
}

g, h, i, j in a0, a1, a2, a3.
f in s0.
temp is s1.

leaf:
	# prologue start
	addi sp, sp, -8   # Make space for 8 bytes to store 2 integers
	sw s1, 4(sp)      # Save s1 and s0 on the stack
	sw s0, 0(sp)
	# prologue end
	add s0, a0, a1    # f = g + h
	add s1, a2, a3    # temp = i + j
	sub a0, s0, s1    # a0 = (g + h) - (i + j) 

	# epilogue
	lw s0, 0(sp)      # Restore s1, s0
	lw s1, 4(sp)    
	addi sp, sp, 8 

	jr ra

sp#

sp is the stack pointer; in RISC-V it is register x2. The stack grows downward from the top of the memory space.

push decreases the address held in sp.
pop increases it.
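A downward-growing stack can be simulated in C with a small array and a byte offset standing in for sp. Everything here (the Stack type, STACK_BYTES, stack_init, push_word, pop_word) is hypothetical and only mirrors the addi/sw and lw/addi pairs used in the listings.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u /* arbitrary size chosen for this sketch */

typedef struct {
    uint32_t words[STACK_BYTES / 4]; /* simulated stack memory */
    uint32_t sp;                     /* byte offset; starts at the top */
} Stack;

void stack_init(Stack *s) {
    s->sp = STACK_BYTES; /* empty stack: sp points just past the top */
}

void push_word(Stack *s, uint32_t value) {
    assert(s->sp >= 4);          /* room for one more word */
    s->sp -= 4;                  /* addi sp, sp, -4 */
    s->words[s->sp / 4] = value; /* sw value, 0(sp) */
}

uint32_t pop_word(Stack *s) {
    assert(s->sp + 4 <= STACK_BYTES);     /* stack is not empty */
    uint32_t value = s->words[s->sp / 4]; /* lw value, 0(sp) */
    s->sp += 4;                           /* addi sp, sp, 4 */
    return value;
}
```

Pushing two words moves sp from 64 down to 56, and popping them returns the values in reverse order while sp climbs back to 64.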

Each function has a set of data stored on the stack, known as the stack frame, which typically includes:

Return address.
Parameters.
Space for local variables used.

Nested Function Calls#

int sumSquare(int x,int y){
	return mult(x,x) + y;
}

The return address for sumSquare is stored in ra, but this value will be overwritten by the call to mult, so it must be saved first.

caller: the function that calls another function.
callee: the function being called.
When the callee returns, the caller needs to know which registers may have changed and which are guaranteed to be unchanged.
Register conventions: a set of rules stating which registers are left unchanged by a function call (jal) and which may be changed.
1. Some registers are volatile (temp), while others must be preserved (if the callee uses them, it must restore their original values before returning).
2. This reduces the number of registers that need to be saved in a stack frame.
Classification:
1. Preserved across function calls:
  1. sp, gp, tp.
  2. s0 - s11 (s0 is also fp).
2. Not preserved:
  1. Parameter registers and return registers: a0 - a7, ra.
  2. Temp registers: t0 - t6.

The RISC-V code for the above:

x in a0, y in a1.

sumSquare:
	addi sp, sp, -8
	sw ra, 4(sp)             // save the return address on the stack
	sw a1, 0(sp)             // save y (a1) on the stack
	mv a1, a0                // a1 = x => mult(x,x)
	jal mult                 // call mult
	lw a1, 0(sp)             // restore y from the stack
	add a0, a0, a1           // mult() + y
	lw ra, 4(sp)             // restore the return address from the stack
	addi sp, sp, 8
	jr ra

RISC-V Register Names#

RISC-V Method Call Pattern#

matmul:  
    # Push to stack, make space to save several s registers we will use
    addi sp, sp, -36  
    sw ra, 0(sp)  
    sw s0, 4(sp)  
    sw s1, 8(sp)  
    sw s2, 12(sp)  
    sw s3, 16(sp)  
    sw s4, 20(sp)  
    sw s5, 24(sp)  
    sw s6, 28(sp)  
    sw s7, 32(sp)  
body:
    # xxx xxx

end:  
    # Restore the values of the registers  
    lw ra, 0(sp)  
    lw s0, 4(sp)  
    lw s1, 8(sp)  
    lw s2, 12(sp)  
    lw s3, 16(sp)  
    lw s4, 20(sp)  
    lw s5, 24(sp)  
    lw s6, 28(sp)  
    lw s7, 32(sp)  
    addi sp, sp, 36  
    ret

RISC-V Instruction Binary Representation#

R Format Layout#

Used for arithmetic and logical operations.

opcode, funct3, funct7: together these fields tell the processor whether to perform addition, subtraction, left shift, XOR, etc.
1. The opcode for these R-format register-register instructions is fixed at 0110011.
An add operation: add x18, x19, x10 => x18 = x19 + x10.
funct7 rs2 rs1 funct3 rd opcode = 0000000 01010 10011 000 10010 0110011.
rs2 = x10, rs1 = x19, rd = x18.
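The field packing can be checked with a few shifts. The sketch below assumes the standard R-format bit positions (funct7 at bit 25, rs2 at 20, rs1 at 15, funct3 at 12, rd at 7, opcode at 0); encode_r is a hypothetical helper, not an assembler API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack the six R-format fields into a 32-bit
   instruction word. Each field is range-checked against its width. */
uint32_t encode_r(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t rd, uint32_t opcode) {
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32);
    assert(funct3 < 8 && rd < 32 && opcode < 128);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}
```

For add x18, x19, x10 the call encode_r(0, 10, 19, 0, 18, 0x33) produces 0x00A98933, which is the bit string shown in the add example written in hexadecimal.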

I Format Layout#

Handles immediate values, for example, addi rd, rs1, imm => addi a0, a0, 1.

The immediate is a 12-bit two's-complement value, so the range of imm is -2048 to 2047.
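A 12-bit two's-complement immediate covers -2048 to 2047. A minimal sketch of the range check and the I-format packing, assuming the standard layout (imm at bit 20, rs1 at 15, funct3 at 12, rd at 7, opcode at 0); fits_imm12 and encode_i are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when imm fits in 12 signed bits. */
bool fits_imm12(int32_t imm) {
    return imm >= -2048 && imm <= 2047;
}

/* Hypothetical helper: pack the I-format fields into a 32-bit
   instruction word. The immediate is masked to its low 12 bits so
   negative values are stored in two's-complement form. */
uint32_t encode_i(int32_t imm, uint32_t rs1, uint32_t funct3,
                  uint32_t rd, uint32_t opcode) {
    assert(fits_imm12(imm));
    assert(rs1 < 32 && funct3 < 8 && rd < 32 && opcode < 128);
    uint32_t imm12 = (uint32_t)imm & 0xFFFu;
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode;
}
```

For addi a0, a0, 1 (a0 is x10, and the addi opcode is 0010011), encode_i(1, 10, 0, 10, 0x13) gives 0x00150513.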

RISC-V Loads#

Load instructions are also of I type.

fzdwx