Syscalls versus Call Gates

Jeko · Post by **Jeko** » Wed Jul 16, 2008 3:32 am

What are the pros and cons of syscalls and of call gates? Which is the fastest method to handle user to kernel interfacing?

Solar · Post by **Solar** » Wed Jul 16, 2008 4:53 am

Wikipedia writes about Call Gates:

Modern X86 operating systems are transitioning away from CALL FAR callgates. With the introduction of SYSENTER/SYSEXIT and SYSCALL/SYSRET, a new faster mechanism was introduced for control transfers for x86 programs. [...]

It should be noted that call gates are more flexible than the SYSENTER/SYSEXIT and SYSCALL/SYSRET instructions since unlike the latter two, call gates allow for changing from an arbitrary privilege level to an arbitrary privilege level. The fast SYS* instructions only allow control transfers from ring 3->0 and vice versa. Upon comparing call gates to interrupts, call gates are significantly faster.

AJ · Post by AJ » Wed Jul 16, 2008 5:00 am

Hi,

I don't know much about call gates as I never use them, but generally you will use one of three methods for user/kernel transitions:

1. System Call Interrupt (as you suggest - this is like Linux Int 0x80). Each call requires a privilege transition, register storage, stack switch and possible task space switch (if the call requires it, which will happen a lot if you are writing a microkernel) - sloooow (but secure if implemented well)!
2. SYSENTER/SYSEXIT, SYSCALL/SYSRET - if supported (check with CPUID), this is a lot faster than a basic system call interrupt because it skips much of the descriptor loading and privilege checking that an interrupt gate needs. You also have a proper specification for this written by Intel/AMD which will tell you which registers are scratch registers and which need to be preserved.
3. Managed Code - this has the potential to be a very fast system as you do not need any privilege level / task space switches. It also has the potential to be very insecure/unstable if you don't implement your base kernel and JIT compiler well. Although the system calls are potentially fast, JIT compilation could slow down your code generally.

Cheers,
Adam
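
The CPUID check mentioned in point 2 might look roughly like the sketch below. It assumes the caller has already executed CPUID and passes in the raw EDX values from leaf 1 and from extended leaf 0x80000001 (after confirming that the extended leaf exists). The names `has_sysenter`, `has_syscall`, `pick_entry_method` and the `entry_method` enum are made up for illustration, and the preference order is an arbitrary choice, not a recommendation.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, EDX bit 11: SEP (SYSENTER/SYSEXIT available). */
#define CPUID_LEAF1_EDX_SEP      (1u << 11)
/* CPUID leaf 0x80000001, EDX bit 11: SYSCALL/SYSRET available. */
#define CPUID_EXT1_EDX_SYSCALL   (1u << 11)

/* Hypothetical tag for the kernel entry mechanism to use. */
typedef enum { ENTRY_INT, ENTRY_SYSENTER, ENTRY_SYSCALL } entry_method;

/* True if the leaf 1 EDX value advertises SYSENTER/SYSEXIT. */
bool has_sysenter(uint32_t leaf1_edx)
{
    return (leaf1_edx & CPUID_LEAF1_EDX_SEP) != 0;
}

/* True if the leaf 0x80000001 EDX value advertises SYSCALL/SYSRET. */
bool has_syscall(uint32_t ext1_edx)
{
    return (ext1_edx & CPUID_EXT1_EDX_SYSCALL) != 0;
}

/* Pick a fast instruction when one is advertised, otherwise fall back
 * to the software interrupt. The order of preference is an assumption. */
entry_method pick_entry_method(uint32_t leaf1_edx, uint32_t ext1_edx)
{
    if (has_sysenter(leaf1_edx))
        return ENTRY_SYSENTER;
    if (has_syscall(ext1_edx))
        return ENTRY_SYSCALL;
    return ENTRY_INT;
}
```

Passing the register values in, rather than running CPUID inside these functions, keeps the decoding logic independent of any particular compiler's inline assembly syntax.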

Jeko · Post by **Jeko** » Wed Jul 16, 2008 5:01 am

Solar wrote:Wikipedia writes about Call Gates:

Modern X86 operating systems are transitioning away from CALL FAR callgates. With the introduction of SYSENTER/SYSEXIT and SYSCALL/SYSRET, a new faster mechanism was introduced for control transfers for x86 programs. [...]

It should be noted that call gates are more flexible than the SYSENTER/SYSEXIT and SYSCALL/SYSRET instructions since unlike the latter two, call gates allow for changing from an arbitrary privilege level to an arbitrary privilege level. The fast SYS* instructions only allow control transfers from ring 3->0 and vice versa. Upon comparing call gates to interrupts, call gates are significantly faster.

So call gates are faster... But then why do most operating systems use interrupts?

Anyway, I'll use SYSENTER/SYSEXIT and SYSCALL/SYSRET (and interrupt-based syscalls for processors that don't support these instructions).

Jeko · Post by **Jeko** » Wed Jul 16, 2008 5:05 am

AJ wrote:3. Managed Code - this has the potential to be a very fast system as you do not need any privilege level / task space switches. It also has the potential to be very insecure/unstable if you don't implement your base kernel and JIT compiler well. Although the system calls are potentially fast, JIT compilation could slow down your code generally.

But managed code is slower than machine code. Or not?
And with managed code there isn't a fully virtual space for each process. Am I right?

Anyway, I think an OS written in native code is better than an OS written in managed code.

Brendan · Post by **Brendan** » Wed Jul 16, 2008 5:35 am

Hi,

Jeko wrote:What are the pros and cons of syscalls and of call gates? Which is the fastest method to handle user to kernel interfacing?

Given the choice between software interrupts, call gates, SYSENTER and SYSCALL, why not just implement all of them?

Note1: if the CPU doesn't support the SYSENTER and/or SYSCALL instructions, you can emulate the instructions inside your invalid opcode exception handler. This would be slower than a software interrupt, but it's more fun than doing "if (CPU_supports_SYSENTER() ) { FOO } else { BAR }" everywhere.

Note2: software interrupts are slower than call gates, but "INT n" only costs 2 bytes while a "CALL FAR" costs 7 bytes, so if you're optimizing for size (e.g. application startup code that's only run once) and don't want to know if SYSENTER/SYSCALL are supported then using a software interrupt is better. The "INT3" instruction is only one byte (which makes it the smallest possible option) but it's probably better to use it for debugging purposes.

Note3: For 64-bit code, just use SYSCALL. The other options are for 16-bit and/or 32-bit code (including 16-bit and/or 32-bit code running under a 64-bit OS).

Cheers,

Brendan

Jeko · Post by **Jeko** » Wed Jul 16, 2008 5:43 am

Brendan wrote:Hi,

Jeko wrote:What are the pros and cons of syscalls and of call gates? Which is the fastest method to handle user to kernel interfacing?
Given the choice between software interrupts, call gates, SYSENTER and SYSCALL, why not just implement all of them?

Note1: if the CPU doesn't support the SYSENTER and/or SYSCALL instructions, you can emulate the instructions inside your invalid opcode exception handler. This would be slower than a software interrupt, but it's more fun than doing "if (CPU_supports_SYSENTER() ) { FOO } else { BAR }" everywhere.

Note2: software interrupts are slower than call gates, but "INT n" only costs 2 bytes while a "CALL FAR" costs 7 bytes, so if you're optimizing for size (e.g. application startup code that's only run once) and don't want to know if SYSENTER/SYSCALL are supported then using a software interrupt is better. The "INT3" instruction is only one byte (which makes it the smallest possible option) but it's probably better to use it for debugging purposes.

Note3: For 64-bit code, just use SYSCALL. The other options are for 16-bit and/or 32-bit code (including 16-bit and/or 32-bit code running under a 64-bit OS).

Cheers,

Brendan

So why not use INT3 instead of INT 0x80?

Anyway, I'd like to know how to implement the first suggestion. But I would still need an IF to check whether SYSENTER or SYSCALL is available.

AJ · Post by AJ » Wed Jul 16, 2008 5:50 am

Jeko wrote:So why not use INT3 instead of INT 0x80?

Do you mean int 0x06 which is the Invalid Opcode Exception? If you did it that way, you wouldn't know if it really is an invalid opcode, so you need to do the checks suggested by Brendan.

Cheers,
Adam

Jeko · Post by **Jeko** » Wed Jul 16, 2008 6:10 am

AJ wrote:
Jeko wrote:So why not use INT3 instead of INT 0x80?
Do you mean int 0x06 which is the Invalid Opcode Exception? If you did it that way, you wouldn't know if it really is an invalid opcode, so you need to do the checks suggested by Brendan.

Cheers,
Adam

I mean, why does Linux use int 0x80 for syscalls and not int 3, if the int 3 opcode is smaller?

And why do most OSes use interrupts instead of call gates if call gates are faster?

However, when an invalid opcode exception occurs, how can I check whether it's a SYSCALL/SYSENTER or a SYSRET/SYSEXIT?

AJ · Post by AJ » Wed Jul 16, 2008 7:22 am

Hi,

You really need to do some background reading and look at the Intel Manuals. Int 0x03 is the Breakpoint Exception - INT 0x00-0x1F should be reserved by your kernel for handling CPU exceptions.

Jeko wrote:However, when an invalid opcode exception occurs, how can I check whether it's a SYSCALL/SYSENTER or a SYSRET/SYSEXIT?

You have the EIP where the exception occurred - simply check the Opcode (which can be found in the Intel Manuals).

Cheers,
Adam
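
The opcode check could be sketched as follows. The two-byte encodings are the documented ones (SYSCALL is 0F 05, SYSRET is 0F 07, SYSENTER is 0F 34, SYSEXIT is 0F 35); the function name, the enum and the `avail` parameter are hypothetical. The sketch assumes the handler has already safely mapped the bytes at the saved EIP, and it deliberately ignores instruction prefixes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of inspecting the faulting instruction. */
typedef enum {
    FAST_NONE,      /* some other invalid opcode */
    FAST_SYSCALL,
    FAST_SYSRET,
    FAST_SYSENTER,
    FAST_SYSEXIT
} fast_insn;

/* code:  bytes starting at the EIP saved by the invalid opcode exception
 * avail: number of bytes at 'code' that may be read safely */
fast_insn classify_fast_insn(const uint8_t *code, size_t avail)
{
    /* All four instructions are two bytes long and start with 0x0F. */
    if (avail < 2 || code[0] != 0x0F)
        return FAST_NONE;

    switch (code[1]) {
    case 0x05: return FAST_SYSCALL;
    case 0x07: return FAST_SYSRET;
    case 0x34: return FAST_SYSENTER;
    case 0x35: return FAST_SYSEXIT;
    default:   return FAST_NONE;
    }
}
```

Because the invalid opcode exception is a fault, the saved EIP points at the offending instruction itself, so an emulating handler would also have to advance EIP by two bytes (or redirect it) before returning.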

Jeko · Post by **Jeko** » Wed Jul 16, 2008 7:52 am

AJ wrote:You really need to do some background reading and look at the Intel Manuals. Int 0x03 is the Breakpoint Exception - INT 0x00-0x1F should be reserved by your kernel for handling CPU exceptions.

I know that INT 3 is the Breakpoint Exception...
But:

Brendan wrote:The "INT3" instruction is only one byte (which makes it the smallest possible option) but it's probably better to use it for debugging purposes.

INT 3 is the breakpoint exception, but I can use it however I want.

Colonel Kernel · Post by **Colonel Kernel** » Wed Jul 16, 2008 9:12 am

AJ wrote:3. Managed Code - this has the potential to be a very fast system as you do not need any privilege level / task space switches. It also has the potential to be very insecure/unstable if you don't implement your base kernel and JIT compiler well. Although the system calls are potentially fast, JIT compilation could slow down your code generally.

JIT compilation is not a requirement for so-called "managed code". The only requirement is the ability to verify the code for type-safety before it runs, and to disallow modifying the code.

Jeko wrote:But managed code is slower than machine code. Or not?

Depends how it's implemented. There is a lot of general misunderstanding about how it works. I suggest reading the Singularity research papers if you want to know more.

And with managed code there isn't a fully virtual space for each process. Am I right?

Whether or not there is a virtual address space for each process becomes optional with software isolation. In Singularity for example, it is a configurable option (maybe a compile-time option for the kernel, I'm not sure).

jnc100 · Post by **jnc100** » Wed Jul 16, 2008 10:59 am

Jeko wrote:INT 3 is the breakpoint exception, but I can use it however I want.

You can, but then your debugging options become limited. Int 3 (as a single-byte opcode) is useful because a debugger can replace any instruction with it to create a breakpoint. Replacing a two-byte instruction with a two-byte opcode is easy enough, and so on, but it becomes more difficult when the instruction you want to break on is only one byte long. For example, say you have:

```asm
func1:
  add eax, 2
  ret

func2:
  mov eax, 5
  call func1
  ret
```

Now you can easily break on the first line of func1 by replacing it with int3, or int 4, 5, 6, etc. Your handler will then run the 'break_point_hit' code and restore the original opcodes before the iret, so execution can continue. In short, it doesn't matter how many bytes of 'add eax, 2' you overwrite, because you will restore them before they are executed. The problem comes if you want to break on the ret in func1. Replacing it with anything longer than a single byte will overwrite the start of func2, and you cannot guarantee that that code will not be executed before the breakpoint (in func1) is hit (and the code restored). A single-byte interrupt instruction is therefore very useful to have around. IMHO it's best to preserve the special encoding of int 3 for this purpose.

Regards,
John.
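
The patch-and-restore step can be illustrated on a plain byte buffer. This is only a sketch: `breakpoint`, `bp_set` and `bp_clear` are hypothetical names, and `example_code` is a hand-assembled 32-bit encoding of the func1/func2 listing above, treated as data rather than executed.

```c
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC  /* the one-byte breakpoint instruction */

/* Hypothetical record of a planted breakpoint. */
typedef struct {
    size_t  offset;  /* where the breakpoint byte was written */
    uint8_t saved;   /* original byte to put back later */
} breakpoint;

/* Hand-assembled bytes for the listing (32-bit code, loaded at offset 0). */
uint8_t example_code[] = {
    0x83, 0xC0, 0x02,             /* func1: add eax, 2   (offset 0)  */
    0xC3,                         /*        ret          (offset 3)  */
    0xB8, 0x05, 0x00, 0x00, 0x00, /* func2: mov eax, 5   (offset 4)  */
    0xE8, 0xF2, 0xFF, 0xFF, 0xFF, /*        call func1   (offset 9)  */
    0xC3                          /*        ret          (offset 14) */
};

/* Overwrite exactly one byte with int3 and remember what was there. */
breakpoint bp_set(uint8_t *code, size_t offset)
{
    breakpoint bp = { offset, code[offset] };
    code[offset] = INT3_OPCODE;
    return bp;
}

/* Restore the original byte, as the handler would before resuming. */
void bp_clear(uint8_t *code, breakpoint bp)
{
    code[bp.offset] = bp.saved;
}
```

Calling `bp_set(example_code, 3)` replaces only the `ret` of func1 and leaves offset 4, the first byte of func2, untouched. A two-byte `int n` (CD nn) written at the same offset would also have clobbered that byte.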

Jeko · Post by **Jeko** » Wed Jul 16, 2008 11:38 am

Colonel Kernel wrote:
Jeko wrote:But managed code is slower than machine code. Or not?
Depends how it's implemented. There is a lot of general misunderstanding about how it works. I suggest reading the Singularity research papers if you want to know more.

I think managed code MUST be slower than machine code. It's normal.

jnc100, you're right.

Anyway, I think I'll use interrupt-based syscalls plus SYSENTER/SYSCALL and SYSEXIT/SYSRET. The only problem is that I must do an IF for each call.
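
One common way around a per-call IF is to make the decision once at startup and store it in a function pointer. The sketch below only shows that shape: the three stubs are placeholders that return a tag instead of executing `int 0x80`, `sysenter` or `syscall`, and every name in it (`syscall_init`, `do_syscall`, `entry_method`, the stubs) is hypothetical.

```c
/* Hypothetical tag for the mechanism detected at startup. */
typedef enum { ENTRY_INT, ENTRY_SYSENTER, ENTRY_SYSCALL } entry_method;

/* Signature shared by all entry stubs (one argument, for brevity). */
typedef long (*syscall_stub)(long number, long arg0);

/* Placeholder stubs: real ones would hold the inline assembly for the
 * respective instruction. Here they just report which path was taken. */
static long stub_int(long number, long arg0)
{
    (void)number; (void)arg0;
    return ENTRY_INT;
}

static long stub_sysenter(long number, long arg0)
{
    (void)number; (void)arg0;
    return ENTRY_SYSENTER;
}

static long stub_syscall(long number, long arg0)
{
    (void)number; (void)arg0;
    return ENTRY_SYSCALL;
}

/* Defaults to the software interrupt, which is always available. */
static syscall_stub active_stub = stub_int;

/* Run once, after feature detection; this holds the only branch. */
void syscall_init(entry_method method)
{
    switch (method) {
    case ENTRY_SYSENTER: active_stub = stub_sysenter; break;
    case ENTRY_SYSCALL:  active_stub = stub_syscall;  break;
    default:             active_stub = stub_int;      break;
    }
}

/* Every later call goes through the pointer with no further test. */
long do_syscall(long number, long arg0)
{
    return active_stub(number, arg0);
}
```

The indirect call is not free, but it replaces a test on every call with a single one at initialization.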

jal · Post by **jal** » Wed Jul 16, 2008 11:55 am

Jeko wrote:I think managed code MUST be slower than machine code. It's normal.

No, not necessarily. Managed code is checked once, to verify that it doesn't do anything dangerous. After that, it can be run at full speed.

JAL
