Hi,
gerryg400 wrote: I understand the comments, but the purpose of the test was to compare "syscall/sysret" to "int/iretq". Most documentation tells us that the former pair is "4 times quicker" (or something similar) than the latter. I've always felt that this is a useless way of comparing the instructions unless you know how much the rest of the system call costs.
I've always thought that syscall/sysret isn't directly comparable to software interrupts, call gates or SYSENTER. It skips some things that are likely to be necessary (like switching ESP to a kernel stack), so a kernel typically needs to add instructions to its SYSCALL handler that wouldn't have been necessary for the other mechanisms.
As an extreme example: because ESP/RSP isn't switched and the CPU doesn't push anything on the stack while at CPL=3, user space code could do "mov rsp, SOMEWHERE_IN_KERNEL_SPACE" and then "SYSCALL" and trick the kernel into trashing itself or modifying kernel data. To guard against that, the kernel has to save RSP somewhere and load RSP with a "known good" value before anything is pushed on the stack (either by the SYSCALL handler itself, or by the CPU if an NMI or machine check exception occurs).
Note: To be honest, I'm not even sure if it's possible to use SYSCALL in a "guaranteed 100.0000% safe" way (as you can't prevent an NMI or machine check from arriving before the SYSCALL handler switches to a safe stack, and both task switching and IST fail for nesting).
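The "save RSP, then load a known good RSP" step can be sketched roughly as below. This is only an illustration in NASM syntax for long mode, not a definitive handler. The per-CPU offsets PERCPU_USER_RSP and PERCPU_KERNEL_RSP are hypothetical names, and it assumes the kernel has already set up the kernel GS base MSR for SWAPGS and programmed SFMASK so that IF is cleared on entry. It does nothing about the NMI/machine check window mentioned in the note.

```nasm
bits 64

; Hypothetical per-CPU block layout, reached through GS after SWAPGS.
PERCPU_USER_RSP   equ 0x00      ; scratch slot for the caller's RSP
PERCPU_KERNEL_RSP equ 0x08      ; top of this CPU's known good kernel stack

syscall_entry:
    ; Assumption: SFMASK clears IF, so maskable IRQs are off here.
    ; NMI and machine check can still arrive before RSP is switched.
    swapgs                              ; GS base now points at kernel per-CPU data
    mov   [gs:PERCPU_USER_RSP], rsp     ; save the untrusted user RSP (no push yet)
    mov   rsp, [gs:PERCPU_KERNEL_RSP]   ; switch to the known good stack
    push  qword [gs:PERCPU_USER_RSP]    ; from here on pushes are safe
    push  rcx                           ; SYSCALL left the return RIP in RCX
    push  r11                           ; SYSCALL left the saved RFLAGS in R11

    ; ... system call dispatch would go here ...

    pop   r11                           ; restore RFLAGS image for SYSRET
    pop   rcx                           ; restore return RIP for SYSRET
    pop   rsp                           ; back to the user stack
    swapgs                              ; restore the user GS base
    o64 sysret                          ; return to 64-bit user code
```

Everything between SWAPGS and the first PUSH is overhead that a software interrupt or call gate doesn't need, because for those the CPU switches stacks itself.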
For worst case, you'd need to deal with malicious user space code that does something like this:
Code:
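mov eax,0                           ; null selector
mov ds,eax                          ; wipe every data segment register
mov es,eax
mov fs,eax
mov gs,eax
mov rsp,SOMEWHERE_IN_KERNEL_SPACE   ; point the stack at kernel memory
syscall                             ; enter the kernel with that RSP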
gerryg400 wrote: As turdus points out, there is another saving with syscall/sysret (i.e. not really needing to save the GP regs on a syscall). But surely this is true for the int/iretq situation as well?
Yes.
For a fair comparison that isn't overly affected by OS design, you'd want to compare:
- software interrupt with nothing more than IRET
- call gate with nothing more than RETF
- SYSENTER with nothing more than SYSEXIT
- the minimum safe SYSCALL handler
The result won't be entirely OS neutral, as different OSs will take different approaches for the minimum safe SYSCALL handler (e.g. the "mov esp, *something*" part).
I'd also suggest that the caller's code size also be taken into account. SYSCALL and "INT n" both cost 2 bytes. For SYSENTER, the caller needs to store "return EIP" and "return ESP" somewhere (likely EDX and ECX), so even though SYSENTER is only 2 bytes itself it's probably going to cost 6 or more bytes. For 32-bit code, call gates are going to cost a minimum of 6 bytes (using a 16-bit operand size override prefix to avoid the full 32-bit offset that's ignored anyway).
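To make the byte counts concrete, here is a rough sketch of the four caller sequences for 32-bit code in NASM syntax, with the encodings in the comments. The interrupt vector 0x80 and the selector CALLGATE_SEL are assumptions for illustration only, and the SYSENTER sequence is just one plausible arrangement (a kernel that returns to a fixed address wouldn't need the EDX load).

```nasm
bits 32

CALLGATE_SEL equ 0x08           ; hypothetical GDT selector of a call gate

via_int:
    int   0x80                  ; CD 80        = 2 bytes (vector is an assumption)

via_syscall:
    syscall                     ; 0F 05        = 2 bytes

via_sysenter:
    mov   ecx, esp              ; 89 E1        = 2 bytes, "return ESP" for SYSEXIT
    mov   edx, .resume          ; BA imm32     = 5 bytes, "return EIP" for SYSEXIT
    sysenter                    ; 0F 34        = 2 bytes, 9 bytes for the sequence
.resume:

via_callgate:
    ; 16-bit operand size shortens the offset field, which the CPU
    ; ignores anyway when the selector refers to a call gate.
    call  word CALLGATE_SEL:0   ; 66 9A iw iw  = 6 bytes
```

Without the "word" override the far call would be 9A plus a 32-bit offset and a 16-bit selector, or 7 bytes.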
I'd expect that SYSENTER would end up being the winner for performance (for frequently executed pieces of code), and SYSCALL and software interrupts would tie for code size (for infrequently used code).
Cheers,
Brendan