ASM Syntax in the OS FAQ...

Solar · Post by **Solar** » Tue Jul 20, 2004 4:16 am

There are several pages in the OS FAQ that provide ASM examples. Right now, both NASM (Intel) syntax and GAS (AT&T) syntax are used.

NASM is the more popular assembler, mostly because "everybody uses it" and because it's said to have better high-level features (which don't matter for our examples).

(mov eax, value)

GAS is the assembler that is present in any GCC toolchain anyway, and its syntax can easily be inlined in C/C++ source code. The AT&T syntax is more verbose than Intel syntax, adding qualifiers to distinguish values, registers, addresses etc., and is more familiar to people coming from other CPUs, but it is usually considered "ugly" by NASM coders.

(mov $value, %eax)
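
To make the two one-liners above concrete, here is a minimal Python sketch that renders the same two-operand instruction in both notations. The helper names (`intel_insn`, `att_insn`) and the operand representation (register names as strings, immediates as integers) are assumptions made for illustration; they are not part of NASM or GAS.

```python
def intel_operand(op):
    # Intel/NASM: registers and immediates are written bare.
    return str(op)

def att_operand(op):
    # AT&T/GAS: '$' marks an immediate value, '%' marks a register.
    return f"${op}" if isinstance(op, int) else f"%{op}"

def intel_insn(mnemonic, dst, src):
    # Intel order: destination first, then source.
    return f"{mnemonic} {intel_operand(dst)}, {intel_operand(src)}"

def att_insn(mnemonic, dst, src, suffix="l"):
    # AT&T order: source first, then destination. The suffix states the
    # operand size ('b' = 8 bit, 'w' = 16 bit, 'l' = 32 bit).
    return f"{mnemonic}{suffix} {att_operand(src)}, {att_operand(dst)}"

print(intel_insn("mov", "eax", 42))  # mov eax, 42
print(att_insn("mov", "eax", 42))    # movl $42, %eax
```

The sketch only covers register and immediate operands; memory operands follow different rules in each syntax.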

The question is, should we strive to make our ASM examples consistent, and if so, which syntax should we use?

Curufir · Post by **Curufir** » Tue Jul 20, 2004 8:45 am

Well my vote would be for GAS if we're going to be consistent.

Basically to try to reduce the amount of x86-centric stuff in the FAQ. That being said, most of the code examples are for x86-specific stuff (Hardware, architecture etc) so it's not a huge deal, but someone programming on a different processor is going to be more comfortable not being faced with Intel syntax.

Pype.Clicker · Post by **Pype.Clicker** » Tue Jul 20, 2004 9:20 am

hmm, imho the FAQ is clearly x86-oriented! GRUB is x86-only (afaik), we're talking about BIOS everywhere, we explain how PIC etc. can be remapped ...

that'd make me rather in favor of the intel syntax with NASM directive, as it's the syntax people have discovered with intel manuals, Assembler-In-The-Pocket books, and even 'Art of Assembly' (well, sort of) ...

Can you tell how [200+eax+ecx*4] should be translated in AT&T ?
i can't.

Ideally, the FAQ would contain both versions and the user could filter the one he prefers ... but it would take us quite far away from a Wiki, wouldn't it ? ...

Curufir · Post by **Curufir** » Tue Jul 20, 2004 10:38 am

Can't deny that you've got a point Pype, but if they're going to be using GCC then they're going to have to face the GAS problem sooner or later anyhow.

That would be easier with some solid examples (And excessive commenting) to look at.

jinksys · Post by **jinksys** » Tue Jul 20, 2004 4:20 pm

Pype.Clicker wrote:
Can you tell how [200+eax+ecx*4] should be translated in AT&T ?
i can't.

[200+eax+ecx*4] -> 200(%eax,%ecx,4)

Only thing different about at&t indirect memory references is that the
displacement is outside the parentheses, and that you use commas
instead of pluses.
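
The rule described above can be sketched in Python as a pair of formatters that build the same memory operand in both notations. This is only an illustration: the function names (`intel_mem`, `att_mem`) and the way the operand is split into base, index, scale and displacement are assumptions, and no real assembler is involved.

```python
def intel_mem(base=None, index=None, scale=1, disp=0):
    # Intel/NASM: everything goes inside square brackets, joined by '+'.
    # The displacement is written first to match the example in the thread.
    terms = []
    if disp:
        terms.append(str(disp))
    if base:
        terms.append(base)
    if index:
        terms.append(f"{index}*{scale}" if scale != 1 else index)
    return "[" + "+".join(terms or ["0"]) + "]"

def att_mem(base=None, index=None, scale=1, disp=0):
    # AT&T/GAS: displacement outside the parentheses, then
    # (base, index, scale) separated by commas, registers prefixed by '%'.
    if base is None and index is None:
        return str(disp)  # plain absolute address
    inner = f"%{base}" if base else ""
    if index:
        inner += f",%{index},{scale}"
    prefix = str(disp) if disp else ""
    return f"{prefix}({inner})"

print(intel_mem("eax", "ecx", 4, 200))  # [200+eax+ecx*4]
print(att_mem("eax", "ecx", 4, 200))    # 200(%eax,%ecx,4)
```

Each part of the operand has a fixed slot in the AT&T form, so a missing base is written as an empty slot, e.g. `200(,%ecx,4)`.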

at&t syntax is all I use, even though I learned asm through intel syntax,
and like it a lot more.

Solar · Post by **Solar** » Wed Jul 21, 2004 12:38 am

Pype.Clicker wrote: we're talking about BIOS everywhere, we explain how PIC etc. can be remapped ...

We also talk about InlineAssembly, and about using objdump to debug the kernel. Both of which inevitably confront people with AT&T syntax...

Pype.Clicker wrote: that'd make me rather in favor of the intel syntax with NASM directive, as it's the syntax people have discovered with intel manuals, Assembler-In-The-Pocket books, and even 'Art of Assembly' (well, sort of) ...

...because AT&T tutorials and examples are few and far between, and because of that I'd prefer that syntax in the Wiki because sooner or later they will have to handle it anyway.

Pype.Clicker wrote: Can you tell how [200+eax+ecx*4] should be translated in AT&T ?
i can't.

You tell me what's that supposed to mean (since I can't really read Intel syntax) and I'll tell you.

Pype.Clicker wrote: Ideally, the FAQ would contain both versions and the user could filter the one he prefers ... but it would take us quite far away from a Wiki, wouldn't it ? ...

Hmm... doesn't sound like that bad an idea, given that we can't get around AT&T completely, anyway. That would probably be even better, because it would give people lots of AT&T <-> NASM references.

The problem is, I failed to come up with any decent way of getting a side-by-side layout, and without that, giving both examples could get... messy. :-[

jinksys · Post by **jinksys** » Wed Jul 21, 2004 1:33 am

Even though I use at&t syntax, I think the Faqs should use intel syntax.
The intel books are in intel syntax, and most people learn assembly
nowadays using intel syntax. If someone can read at&t syntax, they
should have no problem reading Faqs in intel syntax.

bkilgore · Post by **bkilgore** » Wed Jul 21, 2004 10:30 am

I don't think that's any more true than saying, if you can read Intel syntax you can read at&t syntax. Personally, as all my assembly from now on is basically inline, I prefer at&t syntax, although ideally side-by-side or both in some other way would be best.

Dreamsmith · Post by **Dreamsmith** » Wed Jul 21, 2004 4:12 pm

Solar wrote:The question is, should we strive to make our ASM examples consistent, and if yes, which syntax to use.

I voted "Don't care, ..." because I can use either and, honestly, I really don't care, I just need to know which format the tool I'm using is expecting, I'm fine giving it whatever it wants.

I'm not sure about that "as long as it's consistent" part, though. Given the popularity of Intel syntax in the literature, especially Intel's reference manuals, you have to cover Intel syntax. Given the fact that inline code in GCC is AT&T syntax by default, you have to cover that as well. I think you're doomed to including both.

Of course, that doesn't mean you need to be inconsistent. For snippets of code consisting of multiple lines, like it's out of an .asm file, you can consistently use Intel syntax, and for short bits of code you're likely to inline rather than call, use AT&T syntax. Just make sure you clearly mention that each time, with a link to a page explaining the differences and how to translate from one to the other.

Candy · Post by **Candy** » Sun Jul 25, 2004 10:27 am

Solar wrote:
Pype.Clicker wrote: Can you tell how [200+eax+ecx*4] should be translated in AT&T ?
You tell me what's that supposed to mean (since I can't really read Intel syntax) and I'll tell you.

It means, take the value from ecx, multiply it by 4 (ecx * 4), add eax to it (eax+ecx*4) and add 200 to that sum (200+eax+ecx*4), and then view that number as an address and take the contents of it ( [200+eax+ecx*4] ). How is 200(eax,ecx,4) logical, and if you would, what would (eax, -1) mean in AT&T syntax? I can't seem to figure out what it would be.
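
The arithmetic described above can be sketched in Python as follows. The function name `effective_address` and the dictionary standing in for the register file are assumptions made for this illustration. The check on the scale reflects the x86 encoding, which only provides scale factors of 1, 2, 4 and 8, so a scale of -1 has no encoding in either syntax.

```python
def effective_address(regs, base=None, index=None, scale=1, disp=0):
    # Address = disp + base + index * scale, as in [200+eax+ecx*4]
    # or, equivalently, 200(%eax,%ecx,4).
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    addr = disp
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index] * scale
    return addr & 0xFFFFFFFF  # 32-bit addresses wrap around

# Hypothetical register contents, chosen only for the example.
regs = {"eax": 0x1000, "ecx": 3}
print(hex(effective_address(regs, "eax", "ecx", 4, 200)))  # 0x10d4
```

The instruction then reads or writes memory at that address; both notations describe the same computation and differ only in how the four parts are written down.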

Solar · Post by **Solar** » Sun Jul 25, 2004 11:35 am

Candy wrote: It means, take the value from ecx, multiply it by 4 (ecx * 4), add eax to it (eax+ecx*4) and add 200 to that sum (200+eax+ecx*4), and then view that number as an address and take the contents of it ( [200+eax+ecx*4] ).

Ah, indexed addressing!?

Candy wrote: How is 200(eax,ecx,4) logical...

Well, AT&T added an extra syntax for indexed addressing, giving each of the parameters a fixed place. I'm rather sure you could use the Intel notation just the same...

Candy wrote: ...and if you would, what would (eax, -1) mean in AT&T syntax?

Dunno, what does it mean in Intel syntax?
</rhetorical question>

Candy wrote: I can't seem to figure out what it would be.

See?!?

It boils down to a matter of preference, but for either "camp" it's pretty hard to understand the syntax of the other "camp", unless you're fluent in both. Thus, an ASM example in the "wrong" syntax can keep you quite busy for some time trying to figure out what it means - unless, of course, it's liberally commented, something not that popular among "real" programmers...

Candy · Post by **Candy** » Sun Jul 25, 2004 1:16 pm

Solar wrote:
Candy wrote: ...and if you would, what would (eax, -1) mean in AT&T syntax?
Dunno, what does it mean in Intel syntax?
</rhetorical question>

Candy wrote: I can't seem to figure out what it would be.
See?!?

It boils down to a matter of preference, but for either "camp" it's pretty hard to understand the syntax of the other "camp", unless you're fluent in both. Thus, an ASM example in the "wrong" syntax can keep you quite busy for some time trying to figure out what it means - unless, of course, it's liberally commented, something not that popular among "real" programmers...

The point with this is that it's AT&T syntax and that I really don't know what it would be. It's invalid viewed as intel syntax. Somebody here asked what it meant and nobody knew, haven't seen that with Intel syntax (as long as you know the square brackets you're all set, just use common mathematical sense). Can't seem to get that simplicity out of AT&T syntax.

Dreamsmith · Post by **Dreamsmith** » Sun Jul 25, 2004 4:36 pm

Candy wrote:Somebody here asked what it meant and nobody knew, haven't seen that with Intel syntax (as long as you know the square brackets you're all set, just use common mathematical sense). Can't seem to get that simplicity out of AT&T syntax.

I really don't care which way people go, but I must admit as someone who uses both syntaxes frequently, Intel syntax is a lot simpler. With AT&T syntax, you do have that formatting/position stuff to remember, whereas anyone who's passed the 9th grade (one assumes people still learn Algebra by then if not earlier) can't honestly have much difficulty figuring out something like "eax+ecx*4+200". If you didn't understand what that meant, you must be going out of your way to be intentionally dense. "200(eax,ecx,4)" is utterly obscure to anyone not familiar with AT&T syntax, however. What's being multiplied by what? What's being simply added? There are absolutely no clues provided by this syntax. Intel syntax is definitely friendlier to anyone who at least understands high school level mathematics. AT&T syntax is just as readable once you know it, but AT&T syntax has a much steeper learning curve.

Solar · Post by **Solar** » Sun Jul 25, 2004 5:28 pm

That's not to the point. I won't even argue that AT&T syntax has some intricacies.

I could argue that AT&T has the operands in the "right" order, that the additional qualifiers ($, %, operand length etc.) make for more "expressiveness" and that you can nicely inline it.

But I won't. I just say that you will encounter AT&T syntax when disassembling your kernel using objdump, and that you can use AT&T source examples without having to install / handle another tool...

...and that, if anything, this argument here showed that we should figure out some way to have examples in both syntaxes, side by side. It's just that I can't think of a good one, yet...

df · Post by **df** » Sun Jul 25, 2004 6:36 pm

I would like a standardisation ;D that would be nice. (c standardisation, asm standard, guidelines etc etc etc)..

- most people coming into osdev have no asm experience
from that point, it doesn't matter what we choose.

- anyone knowing any ASM coming from a non intel background will have at&t syntax
- anyone from an intel background will 99% of the time have intel syntax.
you're going to upset someone no matter what.

- downloading tools shouldn't be an issue, nasm is what 600kb? unix distros have it on cd. if someone is going to download gcc/binutils what's adding nasm to it?
- other hand, gas/as comes with gcc.
(personally, i never considered gas/as a full fledged assembler but that is beside the point).

if someone has watcom, they will want WASM source. someone will want MASM, and someone TASM from borland compiler side.

from reading it, i think it's easier to read intel syntax over at&t syntax. but there are those who will disagree with it.

it seems these days, anyone doing osdev stuff on x86, glosses over the asm needed. you can do most everything in C and just do an inline LTR or LIDT etc.

from that point maybe we should just minimise the use of asm in the faq?
