OSDev.org

Posted: **Wed Apr 24, 2013 3:24 am**

The CIL bytecode (aka MSIL) used in .NET assemblies is actually really nicely designed. It is sufficiently high-level as to allow the Just-In-Time compiler the most possible freedom (e.g. to optimize), while at the same time smaller than most instructions in real machine code. Contrary to what tjmonk15 says, there is no one-to-one correspondence between CIL instructions and machine code. However, naturally, the basic arithmetic, comparison, jump and bitwise instructions (add, cmp, br, xor) are present in allmost all instruction sets, including CIL.

CIL is a stack-based language: most operands are put on the (imaginary) evaluation stack. This saves a lot of space in the instruction encoding, since those operands are implicit. For example, a method might have the following CIL content:

Code: Select all

{
    ldc.i4.5    // 0x1B: Load constant 5
    ldc.i4.1    // 0x17: Load constant 1
    add         // 0x58: Add
    ret         // 0x2A: Return
}

This method has a bytecode content of just 4 bytes. The first two instructions push a value on the evaluation stack, the third pops two values and pushes a new one, and the last one pops a value and returns it from the method. It is not very hard to convert stack-based bytecode to something that uses registers.

In CIL an opcode consists of one or two bytes. If the first byte is one of the reserved values, then a second byte follows. This is mostly used to encode a short and a long variant of the same instruction. There is also space for one or more explicit operands. For example:

Code: Select all

// Push pre-defined constant
ldc.i4.1			// 0x17
// Push 8-bit constant
ldc.i4.s 0xAB		// 0x1F 0xAB
// Push 32-bit constant
ldc.i4 0xDEADBEEF	// 0x20 0xEF 0xBE 0xAD 0xDE

// Push method argument by pre-defined index
ldarg.0				// 0x02
// Push method argument by 8-bit index
ldarg.s 4			// 0x0E 0x04
// Push method argument by 16-bit index
ldarg 0x123			// 0xFE 0x09 0x23 0x01

If you want to know more about CIL bytecode, I suggest you read ECMA 335. It is a very readable specification. The instructions are in section III.

Posted: **Wed Apr 24, 2013 7:51 am**

Afaik, it's relatively hard to convert from stack based (CIL, Java) bytecode to 3-address machine code (x86, arm); that's one of the reasons why Dis is 3-address bytecode.

Posted: **Wed Apr 24, 2013 8:30 am**

dozniak wrote:Afaik, it's relatively hard to convert from stack based (CIL, Java) bytecode to 3-address machine code (x86, arm); that's one of the reasons why Dis is 3-address bytecode.

Nonsense. Both stack and register machines are translated to SSA using similar mechanisms.

If I was using LLVM, I'd be very tempted to just alloc one real stack location per stack slot/virtual register and then use LLVM's mem2reg pass to lift the code into SSA form where possible.

Posted: **Wed Apr 24, 2013 9:36 am**

To convert stack-based bytecode to register-based bytecode, you do the following:

Make an evaluation stack-like class that can push, pop and can return a virtual register associated with each stack slot. E.g. %slot0, %slot1, ...
Split the bytecode of a method into basic blocks (blocks of code for which control flow only ever enters at the first instruction and leaves at the last).
Get the content of the evaluation stack from one of the preceding blocks, or an empty stack if it is the first basic block.
For each instruction:
1. For each incoming stack operand: get the corresponding register by asking the stack-like class for the register associated with the top-most element, and then pop that from the stack.
2. For each outgoing stack operand: push it on the stack-like class.

Short example:

Code: Select all

ldc.i4 10
ldc.i4 5
ceq
ret

ldc.i4 10 pushes an integer on the stack. The top-most stack slot now has register %slot0.
ldc.i4 5 pushes another integer on the stack. The top-most stack slot now has register %slot1.
ceq would take the top two values from the stack. So it takes a look: %slot1 is the top-most register. Pop it. %slot0 is now the top-most register. Pop it. Push an integer. The top-most stack slot is %slot0.
ret looks at the stack: %slot0 is the top-most register. Pop it.
Done!

So the result as register-based pseudo-bytecode (op dest <- src1, src2) would be:

Code: Select all

ldc.i4 %slot0 <- 10
ldc.i4 %slot1 <- 5
ceq %slot0 <- %slot0, %slot1
ret <- %slot0

Now all that's left is assigning the unlimited virtual registers to the limited pool of physical machine registers in the most efficient manner possible.

Posted: **Wed Apr 24, 2013 1:28 pm**

Nice! Thanks guys, learned something new today!

OSDev.org

bytecode design

Re: bytecode design

Re: bytecode design

Re: bytecode design

Re: bytecode design

Re: bytecode design