RISC-V branch offset machine instruction encoding

What's the decoded RISC-V assembly instruction of: 0001100 01010 11100 100 10001 1100011? From the specification I know that the opcode is the BLT instruction and rs1 = x28, rs2 = x10. But what is the encoded offset? imm[12|10:5] is 0001100 = 12 and imm[4:1|11] is 10001 = -8, right? Where will the jump go?

asked May 23, 2018 at 21:25

You could use a disassembler to do it for you. I think GNU binutils has a RISC-V backend.

Commented May 23, 2018 at 21:55

Yes, RISC-V is supported by binutils.

Commented May 25, 2018 at 13:21

Sadly, instruction set documentation in general does not completely or correctly cover the offsets. It is not uncommon to find that there is an assumed offset, and/or that the immediate is a number of instructions rather than a number of bytes, or some such thing. So you normally have to either use an existing tool to figure this out, or, if you work there, just walk over to one of the chip folks' cubes and ask.

Commented May 25, 2018 at 13:24

Make a file with .word 0x18ae48e3 in it, assemble it, then disassemble. Link it at some non-zero address, then disassemble again.

Commented May 25, 2018 at 14:03

Thanks for your help @old_timer. I assembled a file containing main: .word 0x18ae48e3 with riscv64-unknown-elf-gcc. When disassembling the binary, gdb gave me 0x0000000000000000 : blt t3,a0,0x990 . So I'm assuming the decoded offset is 0x990.

Commented May 27, 2018 at 17:40

2 Answers

The RISC-V Instruction Set Manual lists the complete instruction set in chapter 19. The opcode (the 7 least significant bits) tells us that we are dealing with a B-type instruction. The funct3 bits ([14:12]) then select the BLT instruction.

The BLT instruction is encoded as follows:

[Figure: RISC-V instruction set, BLT instruction]

field        imm[12|10:5]  rs2        rs1         funct3  imm[4:1|11]  opcode
instruction  0001100       01010      11100       100     10001        1100011
value        0xc           0xa (x10)  0x1c (x28)  0x4     0x11         0x63
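The field split in the table can be reproduced with a few shifts and masks. The following is a minimal sketch, not taken from the original answer; the helper names `bits` and `split_b_type` are hypothetical, and the bit positions are the B-type positions listed above.

```python
# Hypothetical helpers for illustration; bit positions follow the B-type layout.
WORD = 0x18AE48E3  # 0001100 01010 11100 100 10001 1100011


def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] (inclusive on both ends) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def split_b_type(word: int) -> dict:
    """Split a 32-bit word into the six raw B-type fields."""
    return {
        "imm_hi": bits(word, 31, 25),  # imm[12|10:5]
        "rs2": bits(word, 24, 20),
        "rs1": bits(word, 19, 15),
        "funct3": bits(word, 14, 12),
        "imm_lo": bits(word, 11, 7),   # imm[4:1|11]
        "opcode": bits(word, 6, 0),
    }


fields = split_b_type(WORD)
print({name: hex(value) for name, value in fields.items()})
```

Note that `imm_hi` and `imm_lo` are still raw bit groups at this point; they are not numbers that can be read on their own, which is why interpreting 0001100 as 12 or 10001 as -8 does not give the offset.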

The immediate value is the concatenation of the instruction bits [31|7|30:25|11:8]: 0|1|001100|1000 = 0x4c8. Note that this encoded value omits bit 0 of the offset (imm[0]).
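As a sketch of that concatenation (variable names are illustrative, not from the specification), each group of instruction bits can be pulled out separately and then packed in the order imm[12], imm[11], imm[10:5], imm[4:1]:

```python
WORD = 0x18AE48E3

imm12 = (WORD >> 31) & 0x1     # inst[31]    -> imm[12]
imm11 = (WORD >> 7) & 0x1      # inst[7]     -> imm[11]
imm10_5 = (WORD >> 25) & 0x3F  # inst[30:25] -> imm[10:5]
imm4_1 = (WORD >> 8) & 0xF     # inst[11:8]  -> imm[4:1]

# Pack 1 + 1 + 6 + 4 = 12 bits; bit 0 of the offset is not stored in the word.
encoded = (imm12 << 11) | (imm11 << 10) | (imm10_5 << 4) | imm4_1
print(f"{encoded:012b} = {encoded:#x}")  # 010011001000 = 0x4c8
```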

The specification describes the B format as follows: "There are a further two variants of the instruction formats (B/J) based on the handling of immediates, as shown in Figure 2.3. The only difference between the S and B formats is that the 12-bit immediate field is used to encode branch offsets in multiples of 2 in the B format. Instead of shifting all bits in the instruction-encoded immediate left by one in hardware as is conventionally done, the middle bits (imm[10:1]) and sign bit stay in fixed positions, while the lowest bit in S format (inst[7]) encodes a high-order bit in B format."

This is because RISC-V has a 16-bit instruction alignment constraint (1.2 Instruction Length Encoding):

The base RISC-V ISA has fixed-length 32-bit instructions that must be naturally aligned on 32-bit boundaries. However, the standard RISC-V encoding scheme is designed to support ISA extensions with variable-length instructions, where each instruction can be any number of 16-bit instruction parcels in length and parcels are naturally aligned on 16-bit boundaries. The standard compressed ISA extension described in Chapter 12 reduces code size by providing compressed 16-bit instructions and relaxes the alignment constraints to allow all instructions (16 bit and 32 bit) to be aligned on any 16-bit boundary to improve code density.

So we need to append a trailing 0 to the encoded immediate, which gives us the byte offset: 0|1|001100|1000|0 = 0x990.
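A small sketch of this last step, assuming the 12 encoded bits imm[12:1] have already been gathered into one integer. The function name `offset_from_encoded` is hypothetical. It also sign-extends from imm[12], which does not matter for this instruction (imm[12] is 0) but is needed for backward branches; the value 0xFFE is an illustrative encoding of -4, not something from the question.

```python
def offset_from_encoded(encoded12: int) -> int:
    """Turn the 12 encoded bits imm[12:1] into a signed byte offset."""
    offset = encoded12 << 1   # append the implicit imm[0] = 0
    if offset & (1 << 12):    # imm[12] is the sign bit of the 13-bit offset
        offset -= 1 << 13
    return offset


print(hex(offset_from_encoded(0x4C8)))  # 0x990
print(offset_from_encoded(0xFFE))       # -4 (illustrative backward branch)
```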

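The decoded instruction is: blt x28, x10, 0x990. The offset is relative to the address of the branch itself, so the jump goes to pc + 0x990 (2448 bytes forward); with the instruction at address 0, as in the gdb output from the comments, the target is 0x990.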
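Because conditional branches in RISC-V are PC-relative, a disassembler prints the absolute target, which changes with the address the instruction is placed at. A rough sketch under stated assumptions: `branch_target` is a hypothetical helper, the 64-bit address width is assumed from the riscv64 toolchain mentioned in the comments, and 0x10000 is an arbitrary example address.

```python
OFFSET = 0x990  # byte offset decoded from 0x18ae48e3


def branch_target(pc: int, offset: int, xlen: int = 64) -> int:
    """Return pc + offset, wrapped to an xlen-bit address (xlen is an assumption)."""
    return (pc + offset) & ((1 << xlen) - 1)


# Address 0 matches the gdb session; 0x10000 is an arbitrary non-zero placement.
for pc in (0x0, 0x10000):
    print(f"pc={pc:#x}: blt x28, x10, {branch_target(pc, OFFSET):#x}")
```

This is why disassembling the same word after linking it at a non-zero address shows a different number after the register operands, even though the encoded offset is unchanged.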