Intel syntax with GAS?

moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Intel syntax with GAS?

Post by moonchild »

The wiki article on GAS recommends against using GAS's Intel syntax. It gives two reasons:
  1. Even in Intel mode, operands are reversed in some cases
  2. The code produced may be suboptimal
Reason 1 is certainly true, but it only applies to a small number of x87 instructions, which should be avoided anyway. Reason 2 seems to have been fixed; at least, I was unable to reproduce it:

Code: Select all

$ cat testatt.s
.code16     # NB. I get the same results in 64-bit mode
f1:
mov $5, %si
mov %di, %si
mov %ax, %si
movsx %al, %si
movsb (%si), (%di)
$ cat testintel.s
.intel_syntax noprefix
.code16
f1:
mov si, 5
mov si, di
mov si, ax
movsx si, al
movsb [di], [si]
$ as testatt.s -o testatt.o && as testintel.s -o testintel.o
$ diff test{att,intel}.o && echo same
same
Should the wiki page be updated? Does anyone know of other reasons to avoid GAS's Intel syntax? It's worked great for me, personally.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Intel syntax with GAS?

Post by Combuster »

Your average bootloader example contains something like

Code: Select all

jmp far 0x0008:0x00010000
On my system's binutils 2.33.1, that still gives

Code: Select all

astest.s:4: Error: junk `0x0008:0x00010000' after expression
Have fun figuring out the correct syntax if you don't know it already. Hint: it's not in the Intel or AMD manuals :wink:
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
nexos
Member
Posts: 1079
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Intel syntax with GAS?

Post by nexos »

It may be hard enough to figure out GAS syntax to begin with! Luckily, I found a document on the net (forget where) that saved me a few weeks ago when I switched to GAS.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
crosssans
Member
Posts: 39
Joined: Fri Mar 01, 2019 3:50 pm
Location: France

Re: Intel syntax with GAS?

Post by crosssans »

Combuster wrote:Your average bootloader example contains something like

Code: Select all

jmp far 0x0008:0x00010000
On my system's binutils 2.33.1, that still gives

Code: Select all

astest.s:4: Error: junk `0x0008:0x00010000' after expression
Have fun figuring out the correct syntax if you don't know it already. Hint: it's not in the Intel or AMD manuals :wink:
It is indeed tricky to find GAS equivalents for instructions written in Intel syntax, so I'll take the opportunity to give the AT&T-syntax equivalent of your instruction here, so that people don't have to figure it out the hard way (as I did!) :P

Code: Select all

ljmp $0x8, $0x10000
vvaltchev
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Intel syntax with GAS?

Post by vvaltchev »

That's an interesting topic. When I started Tilck, I used nasm, because I like the Intel syntax and because nasm is a great assembler. The problem came when I wanted to include header files in both C files and assembly files: I wanted #ifdefs and #defines to work in both places. To do that, I had to run the preprocessor manually on the nasm files, which added an extra build step. Too much complexity. So I decided to switch to GAS: when .S files are passed to gcc, it runs the preprocessor and then GAS. In addition to that, using GAS meant dropping an extra dependency (nasm) from my project, which is always good.
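As a rough illustration of that workflow (a sketch, not taken from Tilck; the header name `config.h`, the symbol `early_init`, and the macros `DEBUG_BUILD` and `DEBUG_MAGIC` are hypothetical), a `.S` file built through gcc can share preprocessor definitions with C code:

```asm
/* example.S: build with something like `gcc -c example.S -o example.o`.
   The capital .S extension makes gcc run the C preprocessor before GAS. */
#include "config.h"        /* hypothetical header shared with the C sources */

.intel_syntax noprefix
.text
.global early_init         /* hypothetical symbol name */
early_init:
#ifdef DEBUG_BUILD         /* hypothetical macro defined in config.h */
    mov eax, DEBUG_MAGIC   /* replaced by its numeric value before GAS runs */
#else
    xor eax, eax
#endif
    ret
```

Any C declarations in the shared header need to be hidden from the assembler, for example behind `#ifndef __ASSEMBLER__`, which gcc defines when it preprocesses assembly files.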

Now, returning to the topic, I wanted GAS, but I wanted the Intel syntax too. I cannot say that going from nasm to GAS + Intel syntax was completely painless, but with minor fixes here and there, it worked. And yeah, about the far jump, in my "legacy" bootloader I have code like:

Code: Select all

enter_32bit_protected_mode:
   cli
   lidt [idtr]
   lgdt [gdtr_base_X]
   mov eax, cr0
   or al, 1
   mov cr0, eax
   jmp 0x08:complete_flush

.code32
complete_flush:
   lgdt [gdtr_flat]
   jmp 0x08:(BL_ST2_DATA_SEG * 16 + complete_flush2)
complete_flush2:
It works great and it even evaluates literal expressions, which is another feature that I use a lot.
In some cases I needed a far jump to an address held in a register, which I couldn't express as a jmp, so I had to use retf (see my comments below):

Code: Select all

   push 0x08
   push esi
   retf
The fanciest thing I had to write was a jump to a 32-bit segment while in long mode, in order to enter "32-bit compatibility mode" (a necessary step for getting from long mode to 32-bit protected mode):

Code: Select all

   lea rdx, [rip + compat32]
   push 0x08
   push rdx
   rex.W retf # perform a 32-bit far ret (=> far jmp to 0x08:compat32)
When I wrote it at the time, I found no better way. I'm not sure if nasm could have helped skipping the rex.W prefix or the retf. Taking a quick look now at the table at https://c9x.me/x86/html/file_module_x86_id_147.html, I believe that x86 simply doesn't support an indirect far jump with a register operand. If anyone has a better alternative for the "retf" instructions above, I'd be happy to update my code. But, for the moment, I don't believe it's a GAS limitation.

Still, that said, I didn't know that GAS might generate inferior machine code when the Intel syntax is used. Does it still apply to GCC 7.x? Are there any historic bugs I can look at? I hope the problem does not exist anymore, because the Intel syntax is *so clean* and using the same GCC toolchain is extremely convenient. Nasm is better per se but, for the reasons I've explained, remaining in the same toolchain is much more convenient for me.

Vlad
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
Octocontrabass
Member
Posts: 5531
Joined: Mon Mar 25, 2013 7:01 pm

Re: Intel syntax with GAS?

Post by Octocontrabass »

vvaltchev wrote:I'm not sure if nasm could have helped skipping the rex.W prefix or the retf.
NASM will accept "retfq" in place of "rex.w retf".
vvaltchev wrote:If anyone has a better alternative for the "retf" instructions above, I'd be happy to update my code.
Store the destination far pointer in memory and use a far JMP with a memory operand. Using RET without a corresponding CALL can mess up the branch prediction.

Whether or not that's actually better depends on what metric you're using.
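A minimal sketch of that suggestion in GAS Intel syntax, for the 32-bit case where the offset is only known at run time in esi. The label `far_ptr` is made up for illustration, and the `fword ptr` spelling of the 16:32 memory operand is my understanding of what GAS accepts, so check it against your binutils version:

```asm
.intel_syntax noprefix
.code32
    mov [far_ptr], esi          # store the 32-bit offset computed at run time
    jmp fword ptr [far_ptr]     # indirect far jump through a 16:32 far pointer

.balign 4
far_ptr:                        # hypothetical label; must live in writable memory
    .long 0                     # offset, patched by the mov above
    .word 0x08                  # code segment selector
```

This avoids the unpaired RET at the cost of a writable scratch location for the far pointer.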
vvaltchev wrote:Intel syntax is *so clean*
NASM syntax is clean. Intel syntax is ambiguous.

In NASM syntax, a symbolic operand without brackets around it is an immediate operand. In Intel syntax, a symbolic operand without brackets around it may be either an immediate operand or a memory operand, and you have to look at how that symbol is defined to know which one it will be.
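A small sketch of that ambiguity, as I understand GAS's Intel mode to behave (the symbol names are made up, and the treatment of constants may depend on the binutils version and on whether the constant is defined before its first use):

```asm
.intel_syntax noprefix
.code32
.equ  PAGE_SIZE, 4096        # absolute constant, defined before it is used

.data
counter:
    .long 0                  # a label that names a location in memory

.text
    mov eax, PAGE_SIZE       # treated as an immediate (known constant)
    mov eax, counter         # treated as a memory operand: loads the dword at counter
    mov eax, offset counter  # immediate: the address of counter
    mov eax, [counter]       # memory operand again, this time written explicitly
```

The first two instructions look identical in form, yet they assemble to different kinds of operand. In NASM, an unbracketed symbol is always an immediate.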
vvaltchev
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Intel syntax with GAS?

Post by vvaltchev »

Octocontrabass wrote:NASM will accept "retfq" in place of "rex.w retf".
Ah, thanks! Also, I just noticed that the comment near the "rex.w retf" is just wrong :-)
Octocontrabass wrote:Store the destination far pointer in memory and use a far JMP with a memory operand. Using RET without a corresponding CALL can mess up the branch prediction.
Yeah, I thought about that, but I preferred to avoid it. Maybe it is better, but I have mixed feelings. Performance is not important here, as it happens just once, when switching to PM32 before jumping to the kernel. But I totally agree with your argument.
Octocontrabass wrote:
vvaltchev wrote:Intel syntax is *so clean*
NASM syntax is clean. Intel syntax is ambiguous.

In NASM syntax, a symbolic operand without brackets around it is an immediate operand. In Intel syntax, a symbolic operand without brackets around it may be either an immediate operand or a memory operand, and you have to look at how that symbol is defined to know which one it will be.
You're totally right. By "Intel" I meant the NASM/Intel family of syntaxes as opposed to AT&T. But yeah, compared to plain Intel syntax, the NASM dialect is much better. I remember running into the ambiguities you're talking about myself.