Octocontrabass wrote:Long mode does support call gates.
That's not my impression. You cannot use a protected mode call gate to get from 32-bit user mode to the kernel entry point that is defined in the gate when the processor is in long mode. This only works if the processor is in protected mode.
Octocontrabass wrote:What it doesn't support is direct far calls - you would have to use indirect ones instead.
Yes, because far calls don't switch ring.
I don't think so. By using syscall or sysenter I will need another far call in the kernel to get to the correct server routine. I will need to push and pop a segment register (ds) and load ds with the call table segment. I need to inspect the user mode call stack to get the syscall number, and the caller will need to push it on the stack. The server routine returns with a far return, adding yet another segment register load. This method requires two segment register loads on entry, two for saving & restoring a segment register, one for reading the user mode stack, one for getting the server address, one for calling the server routine, one for exiting the server routine and two for returning to user mode. That's a total of 10 segment register loads.
A call gate has two segment register loads on entry (ss and cs) and two segment register loads on exit (ss and cs). No other logic is needed; the server routine is called directly.
In summary, I'm convinced that call gates are faster on older processors, and probably on more modern ones too. Needing more than twice as many segment register loads and a dozen or so additional operations cannot be faster, even if syscall / sysenter has some optimizations.
Edit: There actually is more to it. On Intel processors SYSCALL is long mode only (only AMD supports it in protected mode), so it cannot be relied on in protected mode. SYSENTER sets up a flat CS & SS in the kernel and saves neither the user mode CS nor the user mode SS. This means that the caller must save CS & SS on the user mode stack, and on return these must be restored. The kernel stack must also be loaded from the thread control block, which requires an additional three or so segment register loads. Since SYSEXIT uses ECX & EDX to return, these must also be saved on the user mode stack and loaded prior to calling the server routine. After the call, ECX & EDX must be saved on the user mode stack, adding another three segment register loads. So, the final result is 16 vs 4 segment register loads.
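To make the counting above easier to check, here is a small tally in C. It is only a sketch: the function names are made up for illustration, and the numbers are the estimates from this post (including the "three or so" figures), not measurements.

```c
/* Tally of segment register loads per system call.
   All counts are the estimates given in the post; nothing is measured.
   The function names are hypothetical. */

/* Call gate: ss and cs on entry, ss and cs on exit. */
int call_gate_loads(void)
{
    return 2 + 2;
}

/* sysenter-style entry with a far call to the server routine. */
int sysenter_dispatch_loads(void)
{
    return 2    /* entry */
         + 2    /* saving & restoring ds */
         + 1    /* reading the user mode stack */
         + 1    /* getting the server address */
         + 1    /* calling the server routine */
         + 1    /* exiting the server routine */
         + 2;   /* returning to user mode */
}

/* Adds the extra work described in the edit. */
int sysenter_total_loads(void)
{
    return sysenter_dispatch_loads()
         + 3    /* loading the kernel stack from the thread control block */
         + 3;   /* handling ECX & EDX on the user mode stack */
}
```

With these estimates, call_gate_loads() gives 4, sysenter_dispatch_loads() gives 10 and sysenter_total_loads() gives 16.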
Octocontrabass wrote:rdos wrote:The problem in relation to GCC is that parameters must be passed in registers, and so I need to define function prototypes that load the correct registers and then do the syscall.
I'm not sure I understand the problem here. GCC allows you to pass parameters in registers to inline assembly.
The problem is that this is not standardized and so GCC and OW need different code. Two versions to keep up to date rather than only one.
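A rough sketch of what I mean, using a made-up wrapper called add_in_regs. The add instruction is only a stand-in for the real gate or syscall instruction, and the choice of eax and edx is an assumption for the example, not the actual RDOS calling convention. The plain C fallback exists only so the example builds on other compilers and targets.

```c
/* Hypothetical wrapper that takes its parameters in eax and edx and
   returns the result in eax. Each compiler needs its own version. */

#if defined(__WATCOMC__)

/* Open Watcom: registers are bound with #pragma aux. */
int add_in_regs(int a, int b);
#pragma aux add_in_regs = \
    "add eax, edx"        \
    parm [eax] [edx]      \
    value [eax];

#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* GCC: registers are bound with inline assembly constraints. */
static inline int add_in_regs(int a, int b)
{
    int result;
    __asm__ ("addl %%edx, %%eax"
             : "=a" (result)       /* output in eax */
             : "0" (a), "d" (b)    /* a in eax, b in edx */
             : "cc");
    return result;
}

#else

/* Fallback so the example still compiles elsewhere. */
static int add_in_regs(int a, int b)
{
    return a + b;
}

#endif
```

Both compiler-specific versions do the same thing, but each has to be written and kept up to date separately, for every function prototype.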