Brendan wrote:First, I'd deliberately avoid MASM (and TASM when used in "MASM mode"), and any other assembler that exhibits context sensitive behaviour (A86/A386, WASM, JWASM). For example, for something like "mov eax,foo" MASM may generate different instructions depending on how "foo" was defined.
That is a problem that is easily handled with coding rules. Whenever a memory location is referenced, ALWAYS add the segment register used in addressing. That would be "mov eax,ds:foo", for which the assembler always generates the same code. When the intention is to reference foo as an offset, ALWAYS use the "offset" operator, which would be "mov eax,OFFSET foo". Never, ever, rely on "assume" directives. If foo is a numerical value, define it with "=" and use it as a constant like this: "mov eax,foo". Once these coding rules are followed, there is no problem with MASM/TASM/WASM mode.
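To make the three rules concrete, here is a minimal sketch in MASM-style syntax. The names foo and COUNT and the segment names are made up for illustration, and the fragment assumes a 32-bit segment; treat it as an outline of the convention rather than a tested program.

```
; MASM/TASM-style sketch; foo, COUNT, _DATA and _TEXT are hypothetical names
        .386
COUNT   =       16                  ; numeric constant, defined with "="

_DATA   segment use32
foo     dd      0                   ; dword variable
_DATA   ends

_TEXT   segment use32
        mov     eax, ds:foo         ; memory reference: loads the contents of foo
        mov     eax, OFFSET foo     ; offset reference: loads the address of foo
        mov     eax, COUNT          ; constant: loads the immediate value 16
_TEXT   ends
        end
```

Note that there is no "assume" anywhere. Each line says what it means on its own, so the generated instruction does not depend on how foo was declared.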
Brendan wrote:If you're using inline assembly you have to use whatever makes your compiler happy. Given the "avoid MASM" restriction, this may mean switching to a different compiler in some cases (but you can't port a closed source compiler to your OS later anyway, so switching to something like GCC or CLANG would probably be an advantage anyway).
Not possible. I'm not aware of any C compiler with a non-MASM inline assembly syntax that can handle segmentation.
Brendan wrote:TASM is not open source and you can't port it to your OS later, so strike it off your list.
WASM is, and it is largely MASM/TASM compatible.
Brendan wrote:That leaves NASM and YASM (in "Intel syntax" mode). These both use exactly the same pre-processor, the same syntax, etc; and you can freely switch between them with little or no problems at all. This is a good thing (it means your code isn't locked in to a specific assembler and you won't be screwed if there's ever any problem and you want to switch).
How about switching MASM/TASM/WASM compatible code to NASM? Previously, this was not possible, since NASM would interpret a MASM-style "mov eax,foo" as "mov eax,OFFSET foo", which in many cases was not correct. Adding the above rules does not help much in porting either.
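For comparison, this is roughly how the same three cases look in NASM syntax. Again foo and COUNT are hypothetical names, and this is only a sketch of the syntax difference, not a porting recipe.

```
; NASM-style sketch; foo and COUNT are hypothetical names
        bits 32
COUNT   equ     16                  ; numeric constant

section .data
foo:    dd      0                   ; dword variable

section .text
        mov     eax, foo            ; address of foo  (MASM: mov eax, OFFSET foo)
        mov     eax, [foo]          ; contents of foo (MASM: mov eax, ds:foo)
        mov     eax, [ds:foo]       ; explicit segment register goes inside the brackets
        mov     eax, COUNT          ; immediate value 16
```

A bare symbol always means the address and brackets always mean a memory access, so MASM source with a bare "mov eax,foo" changes meaning silently when assembled this way, and the "ds:foo" and "OFFSET foo" forms have to be rewritten rather than reused.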
I would have preferred a new syntax, where only constants defined with "=" could be referenced with "mov eax,foo", and where all references to a structure member / EQU define / segment variable would require either explicit use of a segment register when referencing it as a variable, or would require offset references with the "OFFSET" operator. Adding brackets (NASM-style) could be an option, but the segment register should still be required. Even simple things like "mov eax,[ebx]" should be invalid and should be "mov eax,ds:[ebx]". Then the "assume" directive would have to go, as it would serve no function. Code written for this syntax would be clean and portable.
Another construct has to do with string operations. It would be required to always give segment registers like this: "movs dword ptr es:[edi],fs:[esi]". Some assemblers (notably WASM) don't check the operands properly if this is not done, and might produce unexpected code.
The assembler would know itself when the segment override is default behavior, and would then not generate an override. Coders should not need to bother with that issue.
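As a sketch of what that means in practice, these are the encodings I would expect in a 32-bit (use32) code segment. The byte values are derived from the instruction encoding rules, not captured from a particular assembler, so verify them against a listing file before relying on them.

```
; MASM-style fragment, 32-bit code segment assumed
        mov     eax, ds:[ebx]       ; 8B 03        DS is the default for EBX, no prefix
        mov     eax, ss:[ebp]       ; 8B 45 00     SS is the default for EBP, no prefix
        mov     eax, ds:[ebp]       ; 3E 8B 45 00  DS is not the default, prefix 3E emitted

        movs    dword ptr es:[edi], ds:[esi]    ; A5     plain MOVSD, no prefix
        movs    dword ptr es:[edi], fs:[esi]    ; 64 A5  FS override on the source operand
```

The source always names the segment register, and the assembler decides whether a prefix byte is actually needed.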