Qemu disassembler's output changed

Robert · Post by **Robert** » Mon Feb 01, 2021 6:38 am

Hi!
I'm working on the startup of the application CPUs.
I copied their code into the first 1 MB and sent the IPIs,
and they started execution at the correct place.
But then things went wrong.
I didn't get their output, and sometimes my code seemed to change.
I usually use the 0x40000-0x60000 area for startup code; nothing else accesses the first 1 MB of RAM.
As far as I know, there are no memory-mapped devices in that area.

The code changes looked like this:

(qemu) x/10i 0x1011a6
0x001011a6: cli
0x001011a7: mov $0x6000,%ax
0x001011aa: add $0xf00,%ax
0x001011ad: add %dx,(%bx,%si)
0x001011af: hlt

(qemu) x/10i 0x1011a6
0x001011a6: cli
0x001011a7: mov $0x56000,%eax
0x001011ac: lgdtl (%eax)
0x001011af: hlt

I use QEMU with the i386 architecture ( qemu-i386 -smp 3 ). The bootstrap CPU is in protected mode, the APs are in real mode.
I copied the startup code byte by byte with a simple C++ for loop.

Do you have an idea what happened?

austanss · Post by **austanss** » Mon Feb 01, 2021 8:06 am

I mean it looks pretty much like a normal difference between real mode and protected mode. The lgdt instruction might be from the fact that GDT is mandatory in protected mode.

neon · Post by **neon** » Mon Feb 01, 2021 9:29 am

Hi,

It looks like you are disassembling it as 32-bit code rather than 16-bit code. That changes how the operand-size and address-size override prefixes are interpreted, which results in misalignment (the lgdt is in the middle of two instructions). I.e., I believe it is a disassembly issue only.

Robert · Post by **Robert** » Mon Feb 01, 2021 9:58 am

neon wrote:Hi,

It looks like you are disassembling it as 32-bit code rather than 16-bit code. That changes how the operand-size and address-size override prefixes are interpreted, which results in misalignment (the lgdt is in the middle of two instructions). I.e., I believe it is a disassembly issue only.

That was my first thought too. But if it were only a disassembler issue, my code would have worked.
And it did not.
I mean, after INIT the AP core runs in 32-bit real mode, so it should have loaded the GDT register.
But it stayed zero.

neon · Post by **neon** » Mon Feb 01, 2021 11:19 am

Hi,

I would be interested to see the original machine code as it was assembled, to compare against. With that said, it matches almost perfectly to what would be expected when interpreted as 32 bit code -- almost. Either something modified 03 to 01, or it was assembled differently than what I provide below. In any case, it might be a good idea to read the raw bytes in memory at the start and compare them with the bytes afterwards. 0f 01 10 is lgdt [eax], which spans the two add instructions. It is also possible that something is changing the byte after startup but before you copy it. I can't say whether that is the case; I can only provide my observation.

; 16: FA B80060 05000F 0310 F4 -- what the original code should be
; 32: FA B80060 05000F 0110 F4 -- 03 -> 01?
                     -lgdt- hlt
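
The suggestion to compare the raw bytes before and after could be sketched as below. This is only an illustration: `diff_bytes` is a hypothetical helper, and it assumes the two dumps have already been captured as hex strings (for example from a byte-format memory dump in the QEMU monitor).

```python
def diff_bytes(before_hex, after_hex):
    """Return (offset, before, after) for every byte that differs."""
    before = bytes.fromhex(before_hex)  # fromhex() skips the spaces
    after = bytes.fromhex(after_hex)
    if len(before) != len(after):
        raise ValueError("dumps must cover the same number of bytes")
    return [(i, b, a) for i, (b, a) in enumerate(zip(before, after)) if b != a]


# The two byte strings from the post above.
expected = "FA B80060 05000F 0310 F4"
observed = "FA B80060 05000F 0110 F4"

for offset, b, a in diff_bytes(expected, observed):
    print(f"offset {offset}: {b:#04x} -> {a:#04x}")
```

For these two strings the only difference is at offset 7 (03 versus 01). An empty result would mean the memory contents did not change and only the interpretation did.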

Octocontrabass · Post by **Octocontrabass** » Mon Feb 01, 2021 7:43 pm

Robert wrote: I mean, after INIT the AP core runs in 32-bit real mode,

...No, it runs in 16-bit real mode. Are you trying to run code assembled for 32-bit mode?

neon wrote:it matches almost perfectly to what would be expected when interpreted as 32 bit code -- almost.

Looks like a perfect match to me. You've got the operands backwards on one instruction: "add %dx,(%bx,%si)" = "add [bx+si], dx"
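
The match can be checked by hand with a toy decoder. The sketch below is not a real disassembler: it only knows the handful of opcodes that appear in this thread, handles no prefixes, and prints Intel syntax rather than the AT&T syntax QEMU uses. The byte string is reconstructed from the two QEMU listings, and all names in it are made up for illustration.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["e" + name for name in REG16]
# 16-bit ModRM memory operands for mod=00; rm=6 (bare disp16) is not handled.
MEM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", None, "bx"]


def memory_operand(modrm, bits):
    """Format a mod=00 memory operand without SIB or displacement."""
    mod, rm = modrm >> 6, modrm & 7
    unsupported = (6,) if bits == 16 else (4, 5)
    if mod != 0 or rm in unsupported:
        raise ValueError(f"unsupported ModRM byte {modrm:#04x}")
    return f"[{MEM16[rm] if bits == 16 else REG32[rm]}]"


def decode(code, bits):
    """Decode only the opcodes seen in the thread; prefixes are not handled."""
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    regs = REG16 if bits == 16 else REG32
    imm_len = bits // 8  # immediate width follows the default operand size
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if op == 0xFA:
            out.append("cli")
            i += 1
        elif op == 0xF4:
            out.append("hlt")
            i += 1
        elif op in (0xB8, 0x05):  # mov/add accumulator, immediate
            imm = int.from_bytes(code[i + 1:i + 1 + imm_len], "little")
            out.append(f"{'mov' if op == 0xB8 else 'add'} {regs[0]}, {imm:#x}")
            i += 1 + imm_len
        elif op == 0x01:  # add r/m, reg
            modrm = code[i + 1]
            out.append(f"add {memory_operand(modrm, bits)}, {regs[(modrm >> 3) & 7]}")
            i += 2
        elif code[i:i + 2] == b"\x0f\x01" and (code[i + 2] >> 3) & 7 == 2:
            out.append(f"lgdt {memory_operand(code[i + 2], bits)}")  # 0F 01 /2
            i += 3
        else:
            raise ValueError(f"opcode {op:#04x} is outside this sketch")
    return out


CODE = bytes.fromhex("FA B8 00 60 05 00 0F 01 10 F4")
for bits in (16, 32):
    print(bits, decode(CODE, bits))
```

Decoded as 16-bit code, the ten bytes give five instructions ending in add [bx+si], dx and hlt. Decoded as 32-bit code, the same bytes give cli, mov eax, 0x56000, lgdt [eax], hlt, which is the second QEMU listing.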

Robert · Post by **Robert** » Wed Feb 03, 2021 4:56 am

Octocontrabass wrote:
Robert wrote: I mean, after INIT the AP core runs in 32-bit real mode,
...No, it runs in 16-bit real mode. Are you trying to run code assembled for 32-bit mode?
...

Yes, I was. I then wrote 16-bit initialization code, and now it works fine.
Is there a well-defined difference between the 16, 32 and 64-bit modes of assemblers? This looks a little fuzzy for me, because I can easily use EAX in 16-bit code.
Anyway, thanks to everyone for the help.

nullplan · Post by **nullplan** » Wed Feb 03, 2021 10:26 am

Robert wrote:Is there a well-defined difference between the 16, 32 and 64-bit modes of assemblers? This looks a little fuzzy for me, because I can easily use EAX in 16-bit code.

The interpretation of the code bytes is different for all modes. The default operand sizes are different, as are the default address sizes. Both are overridable, but that also means that the meaning of the override prefixes is inverted. You have access to more and better addressing modes in 32-bit mode (i.e. you can use any register as a base, not just DI, SI, BP, and BX).

This means that the meaning of the same code bytes can differ quite dramatically between modes. In general, executing code in the wrong mode is not going to work.

neon · Post by **neon** » Wed Feb 03, 2021 10:29 am

Hi,

The operand-size override prefix (0x66) selects the "non-default" option. Likewise, the address-size override prefix (0x67) selects the "non-default" option. The operand-size override differentiates between 16 bit and 32/64 bits. The address-size override differentiates between the 16 bit ModRM addressing modes and the 32 bit ModRM addressing modes with a possible SIB byte. The REX prefix is 64 bit only and, if present, is used to differentiate between 32 bit and 64 bit forms. There isn't a way to just look at the code and know how it should be interpreted: you need to know which mode it was originally assembled for (bits 16, bits 32, bits 64) to know the "defaults", which tell you how to interpret 0x66, 0x67, and the REX prefix (which means nothing in non-64 bit mode).

Long story short, the 0x66 prefix is what allows you to use 32 bit registers in 16 bit real mode code and 0x67 allows you to use 32 bit scale-index-base addressing modes in 16 bit real mode code.
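
The "non-default" rule can be written down as a small table. The sketch below is a simplification under stated assumptions: `effective_sizes` is a hypothetical helper, it describes an ordinary instruction only (byte-sized operations, and instructions that default to 64-bit operands in 64-bit mode, are ignored), and it models REX as nothing more than its W bit.

```python
def effective_sizes(mode_bits, prefix_66=False, prefix_67=False, rex_w=False):
    """Return (operand_bits, address_bits) for an ordinary instruction."""
    if mode_bits not in (16, 32, 64):
        raise ValueError("mode_bits must be 16, 32 or 64")
    if rex_w and mode_bits != 64:
        # Outside 64-bit mode the bytes 0x40-0x4F are inc/dec, not REX.
        raise ValueError("REX prefixes exist only in 64-bit mode")
    if mode_bits == 16:
        operand = 32 if prefix_66 else 16  # 0x66 selects the non-default size
        address = 32 if prefix_67 else 16  # 0x67 selects the non-default size
    elif mode_bits == 32:
        operand = 16 if prefix_66 else 32
        address = 16 if prefix_67 else 32
    else:
        # REX.W wins over 0x66; the default operand size stays 32-bit.
        operand = 64 if rex_w else (16 if prefix_66 else 32)
        address = 32 if prefix_67 else 64
    return operand, address


for mode in (16, 32, 64):
    print(mode, effective_sizes(mode), effective_sizes(mode, prefix_66=True))
```

The same prefix byte therefore means "make it 32-bit" in 16-bit code and "make it 16-bit" in 32-bit code, which is why the bytes alone do not say how they should be read.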

Octocontrabass · Post by **Octocontrabass** » Wed Feb 03, 2021 6:12 pm

Robert wrote:Is there a well-defined difference between the 16, 32 and 64-bit modes of assemblers?

The instructions have different binary encodings in each mode, so you must choose the correct assembler mode according to the CPU mode.

You can use exactly the same instructions and operands in both 16-bit and 32-bit mode. As far as the assembler is concerned, the only difference between these two modes is which bytes it generates.
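
As a rough illustration of "which bytes it generates", the sketch below encodes one instruction, a move of an immediate into AX or EAX, for both modes. `encode_mov_acc_imm` is a made-up helper, not any assembler's API, and it covers only this single instruction form (opcode B8).

```python
def encode_mov_acc_imm(reg, imm, mode_bits):
    """Encode `mov ax, imm16` or `mov eax, imm32` for 16- or 32-bit code."""
    sizes = {"ax": 16, "eax": 32}
    if reg not in sizes or mode_bits not in (16, 32):
        raise ValueError("this sketch handles only ax/eax in 16/32-bit mode")
    operand_bits = sizes[reg]
    # The 0x66 prefix is needed only when the operand size is not the default.
    prefix = b"" if operand_bits == mode_bits else b"\x66"
    return prefix + b"\xb8" + imm.to_bytes(operand_bits // 8, "little")


for mode in (16, 32):
    print(mode, encode_mov_acc_imm("eax", 0x56000, mode).hex(" "))
    print(mode, encode_mov_acc_imm("ax", 0x6000, mode).hex(" "))
```

Using EAX in 16-bit code is fine because the assembler adds the 0x66 prefix. The trouble in this thread came from bytes generated for 32-bit mode, with no prefix, being executed by a CPU that was in 16-bit real mode.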

In 64-bit mode, you can use some new instructions and operands, and some old ones aren't available anymore.

OSDev.org