
GCC inline assembly problems

Posted: Sun Dec 23, 2012 8:27 am
by rdos
First problem: inlining a sequence of .byte and .int directives that takes a single parameter.

Here is how it looks in WASM assembly:

Code: Select all


UserGate Macro gate_nr
   db 67h
   db 9Ah
   dd gate_nr
   dw 3
Endm
Seems trivial, doesn't it?

However, GCC cannot inline this without duplicating the code!

The obvious way to do it:

Code: Select all

#define UserGate(nr) \
  asm ( \
  ".byte 0x67\n\t" \
  ".byte 0x9A\n\t" \
  ".int %0\n\t" \
  ".word 0x3\n\t" \
  : \
  : "i" (nr) \
  );

int main()
{
    UserGate(0x123);
   return 0;
}
However, this doesn't work; instead, the linker complains about an undefined reference to $291 (the decimal equivalent of 0x123). I also tested many constraint codes other than "i", but all of them generate either compile-time errors or the same undefined-reference error.
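A possible fix for the 32-bit case, as a minimal sketch: GCC's "%c0" operand modifier prints a constant operand without the "$" prefix, so ".int %c0" should assemble as ".int 291". The example below does not execute the far call. It only emits the constant into .rodata under a hypothetical label gate_nr_demo and reads it back, assuming an ELF target. EMIT_GATE_NR, emit_gate_nr, read_gate_nr and check_gate_nr are made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical symbol for this demo; it is not part of RDOS or GCC. */
extern const uint32_t gate_nr_demo;

#if defined(__ELF__)
/* "%c0" prints the constant bare ("291"). Plain "%0" with an "i"
   constraint prints an x86 immediate ("$291"), which the assembler
   then treats as a symbol name inside .int. */
#define EMIT_GATE_NR(nr) \
    __asm__ volatile ( \
        ".pushsection .rodata\n\t" \
        ".balign 4\n\t" \
        ".globl gate_nr_demo\n\t" \
        "gate_nr_demo:\n\t" \
        ".int %c0\n\t" \
        ".popsection" \
        : \
        : "i" (nr))

/* Never called: it only hosts the asm statement, so the label is
   emitted exactly once. */
__attribute__((noinline, used)) static void emit_gate_nr(void)
{
    EMIT_GATE_NR(0x123);
}
#else
/* Fallback for non-ELF targets, where .pushsection is unavailable. */
const uint32_t gate_nr_demo = 0x123;
#endif

/* Reads back the value that the assembler stored. */
uint32_t read_gate_nr(void)
{
    return gate_nr_demo;
}

/* Sanity check: the stored value equals the macro argument. */
void check_gate_nr(void)
{
    assert(read_gate_nr() == 0x123);
}
```

Applied to the UserGate macro above, the only change would be ".int %c0\n\t" in place of ".int %0\n\t".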

This isn't a big problem, since it is the 32-bit version, and I might just as well stick to OpenWatcom for that.

For a 64-bit version it looks like this:

Code: Select all

#define UserGate(nr) \
  asm ( \
  "pushq %%r15\n\t" \
  "movq %0, %%r15\n\t" \
  "syscall\n\t" \
  "popq %%r15\n\t" \
  : \
  : "i" (nr) \
  );

int main()
{
    UserGate(0x123);
   return 0;
}
Even if this works, wrapping the whole construct in an inline function like this:

Code: Select all


static inline int MyAsm()
{
    int res;

   UserGate(0x123);
   asm("movl %%eax, %0" : "=r" (res) : );

   return res;
}

int main()
{
   int rand = MyAsm();

   return 0;
}
The code isn't inlined, but rather looks like this:

Code: Select all


00000180e000021c <MyAsm>:
 180e000021c:   55                      push   %rbp
 180e000021d:   48 89 e5                mov    %rsp,%rbp
 180e0000220:   41 57                   push   %r15
 180e0000222:   49 c7 c7 14 03 00 00    mov    $0x314,%r15
 180e0000229:   0f 05                   syscall
 180e000022b:   41 5f                   pop    %r15
 180e000022d:   89 c0                   mov    %eax,%eax
 180e000022f:   89 45 fc                mov    %eax,-0x4(%rbp)
 180e0000232:   8b 45 fc                mov    -0x4(%rbp),%eax
 180e0000235:   5d                      pop    %rbp
 180e0000236:   c3                      retq

00000180e0000237 <main>:
 180e0000237:   55                      push   %rbp
 180e0000238:   48 89 e5                mov    %rsp,%rbp
 180e000023b:   48 83 ec 10             sub    $0x10,%rsp
 180e000023f:   b8 00 00 00 00          mov    $0x0,%eax
 180e0000244:   e8 d3 ff ff ff          callq  180e000021c <MyAsm>
 180e0000249:   89 45 fc                mov    %eax,-0x4(%rbp)
 180e000024c:   b8 00 00 00 00          mov    $0x0,%eax
 180e0000251:   c9                      leaveq
 180e0000252:   c3                      retq

I think it is safe to say that, in its current state, OpenWatcom's inline assembly functionality is superior to GCC's.

Edit: Further reading gives a solution:

Code: Select all


inline int MyAsm() __attribute__ ((always_inline));

inline int MyAsm()
{
    int res;

   UserGate(0x123);
   asm("movl %%eax, %0" : "=r" (res) : );

   return res;
}

int main()
{
   int rand = MyAsm();

   return 0;
}
Resulting code:

Code: Select all

00000180e0000237 <main>:
 180e0000237:   55                      push   %rbp
 180e0000238:   48 89 e5                mov    %rsp,%rbp
 180e000023b:   41 57                   push   %r15
 180e000023d:   49 c7 c7 14 03 00 00    mov    $0x314,%r15
 180e0000244:   0f 05                   syscall
 180e0000246:   41 5f                   pop    %r15
 180e0000248:   89 c0                   mov    %eax,%eax
 180e000024a:   89 45 f8                mov    %eax,-0x8(%rbp)
 180e000024d:   8b 45 f8                mov    -0x8(%rbp),%eax
 180e0000250:   89 45 fc                mov    %eax,-0x4(%rbp)
 180e0000253:   b8 00 00 00 00          mov    $0x0,%eax
 180e0000258:   5d                      pop    %rbp
 180e0000259:   c3                      retq
This solution should be possible to apply to RDOS syscalls with minor modifications to header files.
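To check at run time whether a helper really was inlined, one trick is to compare __builtin_frame_address(0) inside the helper with the caller's own value. This is only a sketch and not something taken from the RDOS headers. An inlined helper runs in the caller's stack frame, while an out-of-line one gets a frame of its own. All function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Forced inline: the body becomes part of whichever function calls it. */
static inline __attribute__((always_inline)) uintptr_t frame_always_inline(void)
{
    return (uintptr_t)__builtin_frame_address(0);
}

/* Forced out of line, for contrast. */
static __attribute__((noinline)) uintptr_t frame_noinline(void)
{
    return (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 when the helper shares this function's frame. */
int always_inline_shares_frame(void)
{
    return frame_always_inline() == (uintptr_t)__builtin_frame_address(0);
}

/* Returns 0 when the helper has its own frame. */
int noinline_shares_frame(void)
{
    return frame_noinline() == (uintptr_t)__builtin_frame_address(0);
}

/* Sanity check of both behaviours. */
void check_inlining(void)
{
    assert(always_inline_shares_frame());
    assert(!noinline_shares_frame());
}
```

Unlike the plain inline keyword, always_inline is honoured by GCC even without optimization, which is why the disassembly above no longer contains a callq.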

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 9:00 am
by bluemoon
Note that in your code at least eax/rax is clobbered and you didn't tell GCC; this may become an issue.

By the way, the inline keyword is just a hint, it's not enforced. Have you tried -O2?
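To illustrate the clobber point, here is a small sketch, unrelated to the actual RDOS gate: the asm uses eax as scratch space, so eax is named in the clobber list and GCC keeps its own values out of that register. The helper name twice_via_scratch is hypothetical, and a plain C fallback is used on non-x86 targets.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: doubles x, using eax as a scratch register on x86. */
static inline int twice_via_scratch(int x)
{
    /* Doubling must not overflow. */
    assert(x <= INT_MAX / 2 && x >= INT_MIN / 2);
#if defined(__x86_64__) || defined(__i386__)
    int out;
    __asm__ (
        "movl %1, %%eax\n\t"     /* eax is modified behind GCC's back... */
        "addl %%eax, %%eax\n\t"
        "movl %%eax, %0"
        : "=r" (out)
        : "r" (x)
        : "eax", "cc");          /* ...so it is declared as clobbered */
    return out;
#else
    return x + x;                /* portable fallback for other targets */
#endif
}
```

Without the "eax" entry, GCC would be free to keep x, out, or some unrelated live value in eax across the statement.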

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 10:11 am
by rdos
bluemoon wrote:note that in your code at least eax/rax is clobbered and you didn't tell gcc; this may become an issue.
Possibly. There is also another issue that I just discovered. The code is both inlined, and compiled as a separate procedure. This would provide real code-bloat! :evil:
bluemoon wrote:By the way, the inline keyword is just a hint, it's not enforced. Have you tried -O2?
These should always be inlined. There is no reason why it would not be.

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 10:30 am
by jnc100
rdos wrote:There is also another issue that I just discovered. The code is both inlined, and compiled as a separate procedure. This would provide real code-bloat! :evil:
If your function is not declared as 'static' then it should be accessible from outside the current module. GCC cannot know whether it will be called from elsewhere, so it also emits a non-inline version for code that may call it externally.
rdos wrote:These should always be inlined. There is no reason why it would not be.
The C standard defines the 'inline' keyword as a suggestion only. Without any optimization options specified the default for GCC is -O0 i.e. all optimizations turned off. This provides the fastest compilation times and machine code that most closely resembles the input C file which is useful, e.g. for debugging purposes. For example imagine you are single stepping through your main() function at source code level. When you reach the line 'int rand = MyAsm()' you'd expect a function call to be outputted, which is exactly what GCC does. Trying to debug an optimized program is much harder and often makes source code debugging almost impossible.

Regards,
John.

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 11:10 am
by rdos
jnc100 wrote:When you reach the line 'int rand = MyAsm()' you'd expect a function call to be outputted, which is exactly what GCC does. Trying to debug an optimized program is much harder and often makes source code debugging almost impossible.
Well, either you want to debug the syscall itself (in which case you'd switch to assembly mode), or the debugger should execute the entire inline as a unit. When debugging the assembly code, the debugger should normally step over the "syscall" instruction, but a trace on it should trace into kernel mode. When the code is put into a procedure, there will be two calls (a callq and a syscall), which really is not necessary.

Anyway, I've found one way to implement it that works:

Code: Select all


#define UserGate(nr) \
  asm ( \
    "pushq %%r15\n\t" \
    "movq %0, %%r15\n\t" \
    "syscall\n\t" \
    "popq %%r15\n\t" \
    :: "i" (nr) \
  );

static inline __attribute ((always_inline)) long RdosGetLongRandom()
{
    int res;
    UserGate(user_gate_random);
    asm ("movl %%eax, %0" : "=r" (res) : );
    return res;
}
This solution implies that the extern definitions must be removed, which means the compiler will not be able to consistency-check the definitions.

It also adds an extra mov eax,eax. Maybe it is possible to return the function return value directly? Skipping the return res and asm macro will work, but it would produce compile-time warnings/errors.

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 11:23 am
by rdos
This is even better:

Code: Select all

#define UserGate(nr) \
  asm ( \
    "pushq %%r15\n\t" \
    "movq %0, %%r15\n\t" \
    "syscall\n\t" \
    "popq %%r15\n\t" \
    :: "i" (nr) \
  );

#define UserGateRet(nr, ret) \
  asm ( \
    "pushq %%r15\n\t" \
    "movq %0, %%r15\n\t" \
    "syscall\n\t" \
    "popq %%r15\n\t" \
    : "=a" (ret) : "i" (nr) \
  );

static inline __attribute ((always_inline)) long RdosGetLongRandom()
{
    int res;
    UserGateRet(user_gate_random, res);
    return res;
}
Which generates this code:

Code: Select all

00000180e000021c <main>:
 180e000021c:   55                      push   %rbp
 180e000021d:   48 89 e5                mov    %rsp,%rbp
 180e0000220:   41 57                   push   %r15
 180e0000222:   49 89 c7                mov    %rax,%r15
 180e0000225:   0f 05                   syscall
 180e0000227:   41 5f                   pop    %r15
 180e0000229:   48 89 45 f0             mov    %rax,-0x10(%rbp)
 180e000022d:   48 8b 45 f0             mov    -0x10(%rbp),%rax
 180e0000231:   48 89 45 f8             mov    %rax,-0x8(%rbp)
 180e0000235:   b8 00 00 00 00          mov    $0x0,%eax
 180e000023a:   5d                      pop    %rbp
 180e000023b:   c3                      retq
There is still an unnecessary move, but other than that it seems ok.

With -O2 (even with volatile keyword for inline), GCC removes the whole call! I suppose I need some other flags as well (at least for syscalls with side-effects).

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 3:45 pm
by xenos
I guess you need to replace "asm" with "asm volatile" to keep gcc from removing the assembly completely.
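A minimal sketch of why volatile matters, using a made-up helper rather than the real gate: bump() modifies memory through a pointer, and its only output (the old value) is ignored by the caller. As far as I know, without volatile GCC may treat such an asm as dead code once its outputs are unused. With volatile it has to be kept. The names bump and bump_twice_ignoring_results are hypothetical, and a C fallback is used on non-x86 targets.

```c
#include <assert.h>

/* Hypothetical helper: returns the old value and increments *counter. */
static inline int bump(int *counter)
{
    assert(counter != 0);
#if defined(__x86_64__) || defined(__i386__)
    int old;
    __asm__ volatile (
        "movl (%1), %0\n\t"      /* old = *counter */
        "addl $1, (%1)"          /* *counter += 1, a side effect in memory */
        : "=&r" (old)            /* early clobber: written before %1 is last read */
        : "r" (counter)
        : "memory", "cc");
    return old;
#else
    return (*counter)++;         /* portable fallback for other targets */
#endif
}

/* Calls bump() twice and throws both results away. */
int bump_twice_ignoring_results(void)
{
    int counter = 0;
    bump(&counter);
    bump(&counter);
    return counter;
}
```

The same reasoning would apply to a syscall with side effects: the kernel's work is invisible to the compiler, so the statement has to be marked volatile to survive -O2.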

Re: GCC inline assembly problems

Posted: Sun Dec 23, 2012 5:38 pm
by Owen

Code: Select all

#define UserGateRet(nr, ret) do { \
    register int nr_ asm("r15") = nr; \
    asm("syscall" : "=a" (ret) : "r" (nr_) : "memory"); \
} while (0)

ASM register variables guarantee that the value of that variable will be in that register in any asm block you pass it to with the "r" specifier. The "memory" clobber (you may wish to look up the other clobbers for other cases) tells GCC that your inline assembly has "undefined" side effects, like any other function, and prevents it from being optimized out.

GCC inline assembly is pretty much capable of any crazy combination you can think of (though I have once got to the point where the register allocator gave up because I was touching all 15 general purpose registers).
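Since the RDOS gate cannot be run elsewhere, here is the same register-variable technique as a sketch against the Linux x86-64 convention instead (syscall number in rax, not r15): a raw getpid. The syscall instruction itself overwrites rcx and r11, so both are listed as clobbers. The statement is also marked volatile, on the assumption that the "memory" clobber alone may not be enough to keep an asm whose result goes unused. raw_getpid is a hypothetical name, and other platforms fall back to the libc getpid().

```c
#include <assert.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/* Hypothetical helper: getpid via a raw syscall where that is possible. */
static inline long raw_getpid(void)
{
    long pid;
#if defined(__linux__) && defined(__x86_64__)
    /* Pin the variable to rax: syscall number in, return value out. */
    register long rax __asm__("rax") = SYS_getpid;
    __asm__ volatile (
        "syscall"
        : "+r" (rax)
        :
        : "rcx", "r11", "memory");   /* syscall overwrites rcx and r11 */
    pid = rax;
#else
    pid = (long)getpid();            /* fallback for other platforms */
#endif
    assert(pid > 0);                 /* getpid cannot fail */
    return pid;
}
```

For an r15-based gate the shape would be the same, with the register variable bound to "r15" as in the macro above.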