Outstripping the GCC Inline Compiler: inb/outb functions.

Posted: Thu Sep 11, 2003 9:16 pm
by stonedzealot
So, I got curious about how well GCC handles inline ASM, so I had it compile two simple inline ASM functions: outportb and inportb. They look like this in C:

Code: Select all

void outportb(unsigned port, unsigned val)
{
   __asm__ __volatile__("outb %b0,%w1"
      :
      : "a"(val), "d"(port));
}
and

Code: Select all

unsigned inportb(unsigned short port)
{
   unsigned char ret_val;

   __asm__ __volatile__("inb %1,%0"
      : "=a"(ret_val)
      : "d"(port));
   return ret_val;
}
Basic enough. Then I disassembled them into their GNU ASM counterparts and tried to translate them. First, outportb. The disassembled code is pretty straightforward, as is the NASM translation next to it:

Code: Select all

push %ebp                 push ebp
mov %esp,%ebp             mov ebp, esp
mov 0x8(%ebp),%edx        mov edx, [ebp + 8]
mov 0xc(%ebp),%eax        mov eax, [ebp + 12]
out %al,(%dx)             out dx, al
pop %ebp                  pop ebp
ret                       ret
The code for the inportb function is quite different though, and I'd like to ask some questions. Here's the disassembled code:

Code: Select all

push %ebp
mov %esp,%ebp
sub $0x4,%esp                  <--- (1)
mov 0x8(%ebp),%eax
mov %ax,0xfffffffe(%ebp)       <--- (2)
movzwl 0xfffffffe(%ebp),%edx
in (%dx),%al
mov %al,0xfffffffd(%ebp)       <--- (3)
movzbl 0xfffffffd(%ebp),%eax
leave
ret
(1) Flat out, why in the hell is this here? I could see if it was the addition of four to ESP, but not subtraction...

(2) This line and the next... doesn't that just simplify into movzx edx, ax, or xor edx, edx; mov dx, ax?

(3) Same question for this one: doesn't this just expand al into eax (the same thing you'd get by clearing eax right before the in instruction)?

Re:Outstripping the GCC Inline Compiler: inb/outb functions.

Posted: Fri Sep 12, 2003 12:37 am
by Solar
Answer (1): Stack grows downwards.

Re:Outstripping the GCC Inline Compiler: inb/outb functions.

Posted: Fri Sep 12, 2003 1:33 am
by Pype.Clicker
Well, from the code you obtained, I guess you did not turn the optimizer on... try recompiling with -O3 if you want GCC to output something smart.

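(1.) the sub esp,4 is intended to allocate stack space for the local variable "ret_val" (and, without optimization, for the temporary copy of port that your disassembly stores at ebp-2).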

(2.) as you have "%1":"d"(port), GCC believes the *whole* value of edx must be prepared, so it zeroes the high part. using "%w1" should fix this.

(3.) as you told GCC that the function returns an unsigned and al is just a byte, it goes through an additional movzbl, which zeroes the high part of eax. Saying that inportb returns an unsigned char should fix this.
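
To make point (3.) concrete, here is a minimal sketch in plain C. It does no port I/O, so it runs anywhere. The function names widen_byte and write_low_byte are made up for illustration, and the sketch assumes a 32-bit unsigned. The first function is the conversion that a return type of unsigned asks for; the second models what "in" does on its own, which is to overwrite only the low byte of eax and leave whatever was in the upper 24 bits.

```c
/* Conversion from unsigned char to unsigned: the value is zero-extended.
   On x86 this is typically what a movzbl (or a cleared eax) provides. */
unsigned widen_byte(unsigned char al)
{
    return al;
}

/* Model of an instruction that writes only the low 8 bits of a 32-bit
   register: the upper 24 bits of the old value survive.
   Assumes unsigned is 32 bits wide. */
unsigned write_low_byte(unsigned old_eax, unsigned char al)
{
    return (old_eax & ~0xFFu) | al;
}
```

If the register happened to be zero beforehand, both give the same result, which is why clearing eax before the in instruction would do the same job as the movzbl afterwards. If it held stale data, the caller would see garbage in the high bits unless something zero-extends the byte.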

Re:Outstripping the GCC Inline Compiler: inb/outb functions.

Posted: Fri Sep 12, 2003 2:26 am
by stonedzealot
wow. look at that. All that wasted CPU time in a seemingly simple function. This is why I'm not doing this inline...it's dangerous (and a little weird). Anyway, thanks again Pype.

Solar: Yes...but your response still doesn't tell why it's allocating space...oh well. It's all handled now.

Re:Outstripping the GCC Inline Compiler: inb/outb functions.

Posted: Fri Sep 12, 2003 2:36 am
by Pype.Clicker
well, if the purpose of your code was to have an inline function that would just emit an "out %al, %dx", looking at the code generated for the standalone function will not really help.

1. you must declare inb and outb as "static inline void outb(word port, byte val)"
2. you must enable the optimizer, so that inline functions are actually inlined.
3. you may use "Nd"(port) rather than "d"(port), which will allow GCC to emit "outb %al, $0x20" as well.

Now, as Tim would say, it will not really speed up your code (because in/out have huuuge latencies compared to other instructions), but it can make your code smaller ...
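
Put together, the three points above might look roughly like the sketch below. This is one possible shape, not the only one: it uses uint16_t and uint8_t from stdint.h in place of the word and byte typedefs mentioned above, it is guarded so that it only compiles the asm on x86 with a GCC-compatible compiler, and port_fits_imm8 is a hypothetical helper added only to spell out when the "N" half of the "Nd" constraint can apply. Port I/O normally faults in user-mode programs, so the wrappers are meant for kernel code.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* Write one byte to an I/O port.
   "a"  : val is placed in al.
   "Nd" : port may be an 8-bit immediate when it is a compile-time
          constant in 0..255, otherwise it is placed in dx. */
static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ __volatile__("outb %0, %1" : : "a"(val), "Nd"(port));
}

/* Read one byte from an I/O port. Returning uint8_t means the function
   itself does not ask for a widening of al. */
static inline uint8_t inb(uint16_t port)
{
    uint8_t ret;
    __asm__ __volatile__("inb %1, %0" : "=a"(ret) : "Nd"(port));
    return ret;
}

#endif

/* Hypothetical helper, for illustration only: the immediate form of
   in/out encodes the port in a single byte, so only ports 0..255
   qualify; anything larger has to go through dx. */
static inline int port_fits_imm8(uint16_t port)
{
    return port <= 0xFF;
}
```

Compiled with the optimizer on (for example -O2), a call such as outb(0x20, 0x20) can then be inlined at the call site instead of going through a full function prologue and epilogue. A port like 0x3F8 does not fit in the immediate form, so the compiler falls back to loading dx.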