Possible inaccuracy of Nasm

PeterX · Post by **PeterX** » Sun Aug 23, 2020 7:41 am

I witnessed some people using

mov ds, ax

and some using

mov ds, eax

which is rather confusing to me, because I would expect only one of the two forms to be valid.
But when I assembled (with Nasm) and disassembled the above instructions, both produced the same machine code (in real mode as well as in pmode)!
Isn't that an inaccuracy of Nasm?

Greetings
Peter

iansjack · Post by **iansjack** » Sun Aug 23, 2020 9:16 am

When operating in 32-bit mode and moving data between a segment register and a general-purpose register, the 32-bit IA-32 processors do not require the use of the 16-bit operand-size prefix (a byte with the value 66H) with this instruction, but most assemblers will insert it if the standard form of the instruction is used (for example, MOV DS, AX). The processor will execute this instruction correctly, but it will usually require an extra clock. With most assemblers, using the instruction form MOV DS, EAX will avoid this unneeded 66H prefix.

It sounds to me as if Nasm is actually more accurate than "most assemblers".

fpissarra · Post by **fpissarra** » Sun Aug 23, 2020 9:24 am

PeterX wrote:But when I assembled (with Nasm) and disassembled the above instructions, they both are the same (both in real mode and in pmode)!
Isn't that an inaccuracy of Nasm?

Try assembling with the -O0 option (no optimization). You'll see that `mov ds,ax` gets a 0x66 prefix on 32-bit or 64-bit targets.

You are right about the instruction `mov ds,eax` not existing in the official ISA from Intel, but `mov ds,rax` does. So, I think NASM allows `mov ds,eax` as an extension.

Any of these instructions will use only the lower 16 bits of GPRs.

fpissarra · Post by **fpissarra** » Sun Aug 23, 2020 9:29 am

By the way... NASM allows something like this, as well:

lea eax,[5*rbx]

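It'll be translated to `lea eax,[rbx+4*rbx]`.

This is clearly not a standard effective-address form, but NASM accepts it and rewrites it as base + index*scale.
Of course, `6*rbx` will not work, because 6 cannot be written as a scale of 1, 2, 4 or 8 plus an optional copy of the same register.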
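
To make the rule concrete, here is a small Python sketch of which multipliers can be expressed as a single-register effective address. It only models the arithmetic; `split_multiplier` is a hypothetical helper invented for this illustration and is not NASM's actual algorithm.

```python
# Illustrative sketch only: models which "k*reg" forms fit the x86
# addressing scheme base + index*scale, with scale in {1, 2, 4, 8}.
# split_multiplier is a hypothetical name, not part of NASM.

SCALES = (1, 2, 4, 8)

def split_multiplier(k):
    """Return (use_base, scale) so that k*reg == (reg if use_base else 0) + scale*reg.

    Returns None when k cannot be encoded with a single register.
    """
    # Plain index*scale, no base register needed.
    if k in SCALES:
        return (False, k)
    # Same register used as base and as scaled index: reg + scale*reg.
    if (k - 1) in SCALES[1:]:
        return (True, k - 1)
    return None

for k in (3, 5, 6, 9):
    print(k, split_multiplier(k))
```

So 3, 5 and 9 work through the base + index trick, while 6, 7 and anything above 9 have no encoding.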

PeterX · Post by **PeterX** » Sun Aug 23, 2020 10:42 am

fpissarra wrote:Try assembling with the -O0 option (no optimization). You'll see that `mov ds,ax` gets a 0x66 prefix on 32-bit or 64-bit targets.

You are right about the instruction `mov ds,eax` not existing in the official ISA from Intel, but `mov ds,rax` does. So, I think NASM allows `mov ds,eax` as an extension.

Any of these instructions will use only the lower 16 bits of GPRs.

Thanks. That makes sense.

Greetings
Peter

sj95126 · Post by **sj95126** » Sun Aug 23, 2020 11:25 am

fpissarra wrote:You are right about the instruction `mov ds,eax` not existing in the official ISA from Intel, but `mov ds,rax` does.

It's there, but they did a poor job of explaining it. On the page for MOV, the "mov Sreg, r/m16" syntax is marked with **, which is explained as:

"** In 32-bit mode, the assembler may insert the 16-bit operand-size prefix with this instruction"

That implies that without the operand-size prefix, the equivalent is "mov Sreg, r/m32". Normally they would put such an example in the table, and the opcodes would make it clear that "mov ds, ax" in 16-bit mode is the same opcode as "mov ds, eax" in 32-bit mode, as is the case for many 16-bit/32-bit instructions.
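
As a rough illustration of that point, the Python sketch below builds the bytes for `MOV Sreg, r/m` (opcode 8E /r) with a register source. The function name and the `strict` flag are assumptions made up for this example; `strict=True` stands in for an assembler that always emits the 66H prefix when the written operand size differs from the mode's default.

```python
# Illustrative sketch only: byte encoding of "mov Sreg, reg" (opcode 8E /r).
# encode_mov_sreg and its "strict" flag are hypothetical, not a NASM API.

# Segment register numbers for the ModRM reg field.
# CS (1) is left out because loading CS with MOV raises #UD.
SREG = {"es": 0, "ss": 2, "ds": 3, "fs": 4, "gs": 5}
# General-purpose register numbers for the ModRM r/m field.
GPR = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def encode_mov_sreg(sreg, gpr, mode_bits, strict=False):
    """Return the machine code for `mov sreg, gpr` in a 16- or 32-bit code segment."""
    operand_bits = 32 if gpr.startswith("e") else 16
    # mod=11 (register direct), reg=segment register, r/m=source register.
    modrm = 0xC0 | (SREG[sreg] << 3) | GPR[gpr[-2:]]
    # The 66H prefix toggles the operand size away from the mode default.
    prefix = [0x66] if strict and operand_bits != mode_bits else []
    return bytes(prefix + [0x8E, modrm])

print(encode_mov_sreg("ds", "ax", 16).hex())                # 16-bit mode
print(encode_mov_sreg("ds", "eax", 32).hex())               # 32-bit mode
print(encode_mov_sreg("ds", "ax", 32, strict=True).hex())   # with 66H prefix
```

The first two calls yield the same two bytes, which is why the disassembly looks identical; only the strict variant adds the prefix byte.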

Octocontrabass · Post by **Octocontrabass** » Sun Aug 23, 2020 1:48 pm

MOV to a segment register is always a 16-bit operation, regardless of the operand size. AMD's documentation is somewhat clearer in this regard.

The confusion in Intel's manual might be the result of trying to reconcile years of conflicting documentation without describing officially undefined behavior for MOV from a segment register. On a 386, 486, or Pentium, MOV from a segment register is always a 16-bit operation, ignoring the operand size. On a Pentium Pro or later, it ignores the operand size when the destination is memory, but obeys the operand size and zero-extends the value when the destination is a register. At some point (probably around when the Pentium was released) Intel decided to document that the upper bytes of a register destination were undefined so that software wouldn't rely on them being unchanged by the 386/486/Pentium and break when running on the Pentium Pro.
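
A toy Python model of the two behaviours for a 32-bit register destination, as described above. The function and its `zero_extends` flag are hypothetical names for this illustration; this is a sketch of the described semantics, not a CPU emulator.

```python
# Illustrative sketch only: models "mov r32, Sreg" as described in this post.
# mov_reg32_from_sreg and zero_extends are hypothetical names.

def mov_reg32_from_sreg(old_reg, sreg_value, zero_extends):
    """Return the new 32-bit register value after `mov r32, Sreg`.

    zero_extends=False: 386/486/Pentium, only the low 16 bits are written.
    zero_extends=True:  Pentium Pro or later, the selector is zero-extended.
    """
    low = sreg_value & 0xFFFF
    if zero_extends:
        return low
    # Upper 16 bits of the destination are left as they were.
    return (old_reg & 0xFFFF0000) | low

print(hex(mov_reg32_from_sreg(0xDEADBEEF, 0x0010, zero_extends=False)))
print(hex(mov_reg32_from_sreg(0xDEADBEEF, 0x0010, zero_extends=True)))
```

Code that depends on either result for the upper half is relying on exactly the behaviour Intel chose to document as undefined.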

OSDev.org
