Hi,
CPLH wrote: I didn't look into the details syscall functionality, as I heard it only works in 64 bits. If it automatically sets ecx and edx, then it is more convenient than sysenter as you don't have to mess around trying to figure out efficient methods to get ecx and edx to work.
No - SYSCALL first existed on 32-bit AMD CPUs, and I don't think any 32-bit AMD CPU ever supported SYSENTER. 32-bit Intel CPUs had SYSENTER instead and didn't support SYSCALL.
Then AMD introduced 64-bit 80x86 CPUs, and (IIRC) in their early documentation stated that if a CPU supports long mode then software can assume it also supports other features, including PAE and SYSCALL and some other stuff (I really wish I could find this list again now).
This put Intel in an awkward situation - they wanted to sell Itanium, but had to produce a 64-bit 80x86 to maintain market share (I'd assume AMD really didn't like the idea of everyone shifting to Itanium, because they have patent agreements with Intel for 80x86 but don't have similar agreements with Intel for Itanium, so AMD probably couldn't compete if everyone shifted to Itanium). Anyway, Intel had to support SYSCALL to support long mode, because software developers expected SYSCALL to work in long mode.
However, for 32-bit code software developers didn't really use SYSCALL (but did use SYSENTER), so eventually AMD did introduce support for SYSENTER, but only for legacy (non-long) mode. This means that an AMD CPU in long mode won't support SYSENTER (it causes an invalid opcode exception in 64-bit code, and also in 32-bit compatibility-mode code), but the exact same CPU running 32-bit code in legacy protected mode does support SYSENTER.
To make this more confusing there's only 1 feature flag returned by CPUID to tell the OS if SYSENTER is supported, and only 1 feature flag returned by CPUID to tell the OS if SYSCALL is supported - there's no easy way to tell which instructions are supported in which operating mode.
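For reference, here is a minimal C sketch of where those flags live. It assumes the caller has already executed CPUID and passes in the raw EDX values; the struct and function names are made up for illustration, only the bit positions come from the Intel/AMD documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in EDX, as documented by Intel and AMD. */
#define CPUID_1_EDX_SEP          (1u << 11)  /* leaf 0x00000001: SYSENTER/SYSEXIT */
#define CPUID_EXT1_EDX_SYSCALL   (1u << 11)  /* leaf 0x80000001: SYSCALL/SYSRET   */
#define CPUID_EXT1_EDX_LONG_MODE (1u << 29)  /* leaf 0x80000001: long mode        */

/* Hypothetical container for the raw answers CPUID gives. */
struct raw_cpuid_flags {
    bool sep;        /* CPUID claims SYSENTER/SYSEXIT */
    bool syscall;    /* CPUID claims SYSCALL/SYSRET   */
    bool long_mode;  /* CPUID claims long mode        */
};

/* edx_leaf1 = EDX from CPUID leaf 1; edx_ext1 = EDX from leaf 0x80000001
 * (pass 0 if the CPU does not implement that extended leaf). */
static struct raw_cpuid_flags decode_raw_flags(uint32_t edx_leaf1, uint32_t edx_ext1)
{
    struct raw_cpuid_flags f;
    f.sep       = (edx_leaf1 & CPUID_1_EDX_SEP) != 0;
    f.syscall   = (edx_ext1 & CPUID_EXT1_EDX_SYSCALL) != 0;
    f.long_mode = (edx_ext1 & CPUID_EXT1_EDX_LONG_MODE) != 0;
    return f;
}
```

Note that nothing in these bits says which operating mode each instruction works in, which is exactly the problem.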
So, here's a summary!
AMD CPUs:
- If CPUID says SYSCALL is supported, then SYSCALL is supported for 32-bit code, and if the CPU supports long mode SYSCALL is also supported in 64-bit code.
- If CPUID says SYSENTER is supported, then SYSENTER is only supported in legacy protected mode, and *not* supported in long mode (neither in 64-bit code nor in 32-bit compatibility-mode code).
Intel CPUs:
- If CPUID says SYSCALL is supported, then SYSCALL is supported for 32-bit code (but AFAIK there aren't any Intel CPUs that support 32-bit SYSCALL, so the feature flag in CPUID will always be clear). However, if the CPU supports long mode then SYSCALL is supported in 64-bit code (even though CPUID says it isn't supported).
- If CPUID says SYSENTER is supported, then SYSENTER is supported for 32-bit code, and if the CPU supports long mode then SYSENTER is also supported in 64-bit code.
Also note that for my OS, boot code examines CPU features, etc., and builds its own set of feature flags in RAM (and does some other stuff - brand strings, errata, etc.), and the rest of the OS never uses CPUID but uses the feature flags in RAM instead. This makes it easy to have a SYSCALL32 flag, a SYSCALL64 flag, a SYSENTER32 flag and a SYSENTER64 flag.
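As a rough illustration of how the four flags could be derived from the summary above (this is a sketch, not my actual boot code): it assumes CPUID was executed by 32-bit boot code, only handles AMD and Intel, and ignores errata. All names are hypothetical.

```c
#include <stdbool.h>

enum cpu_vendor { VENDOR_AMD, VENDOR_INTEL };

/* Hypothetical per-mode flags kept in RAM for the rest of the OS. */
struct entry_flags {
    bool syscall32;   /* SYSCALL usable by 32-bit code            */
    bool syscall64;   /* SYSCALL usable by 64-bit code            */
    bool sysenter32;  /* SYSENTER usable by 32-bit code (on AMD:  */
                      /* legacy protected mode only)              */
    bool sysenter64;  /* SYSENTER usable by 64-bit code           */
};

/* cpuid_syscall / cpuid_sep / long_mode are the raw CPUID feature bits. */
static struct entry_flags build_entry_flags(enum cpu_vendor vendor,
                                            bool cpuid_syscall,
                                            bool cpuid_sep,
                                            bool long_mode)
{
    struct entry_flags f = { false, false, false, false };

    /* For 32-bit code both vendors can be taken at CPUID's word. */
    f.syscall32  = cpuid_syscall;
    f.sysenter32 = cpuid_sep;

    if (vendor == VENDOR_AMD) {
        f.syscall64  = cpuid_syscall && long_mode;
        f.sysenter64 = false;                 /* never valid in 64-bit code */
    } else {
        f.syscall64  = long_mode;             /* even if the CPUID bit is clear */
        f.sysenter64 = cpuid_sep && long_mode;
    }
    return f;
}
```

The point of doing it once at boot is that everything else just tests one unambiguous flag instead of re-deriving the vendor-specific rules.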
Note: I honestly wish there was a "disable CPUID for CPL=3 code" flag in CR4 (like the "disable RDTSC for CPL=3 code" flag) so that I could force applications to use the kernel's standardized/unambiguous "CPU information" functions instead of the dodgy mess that CPUID has become, but I'm starting to get off-topic...
If the CPU doesn't support SYSCALL or SYSENTER, then any code that uses the unsupported instruction will generate an invalid opcode exception. Your invalid opcode exception handler can use the return EIP on the exception handler's stack to figure out which instruction was being used, and if it was SYSCALL or SYSENTER it can emulate the instruction. This is what I meant earlier by "emulated SYSCALL/SYSENTER".
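To give a rough idea of the "figure out which instruction" step, here is a minimal C sketch. Both instructions are two bytes long (SYSCALL is 0F 05, SYSENTER is 0F 34). The function name and enum are hypothetical, and the sketch assumes the handler has already safely copied the bytes at the return EIP out of user space and that there are no instruction prefixes.

```c
#include <stddef.h>
#include <stdint.h>

enum ud_kind { UD_OTHER, UD_SYSCALL, UD_SYSENTER };

/* code = bytes found at the faulting EIP, len = how many bytes are readable. */
static enum ud_kind classify_invalid_opcode(const uint8_t *code, size_t len)
{
    if (len < 2 || code[0] != 0x0F)
        return UD_OTHER;          /* not a two-byte 0F xx opcode */
    if (code[1] == 0x05)
        return UD_SYSCALL;        /* 0F 05 */
    if (code[1] == 0x34)
        return UD_SYSENTER;       /* 0F 34 */
    return UD_OTHER;              /* a genuine invalid opcode */
}
```

For an emulated SYSCALL the handler would then need to advance the saved EIP by 2 before returning (the exception is a fault, so the saved EIP still points at the instruction), while for an emulated SYSENTER it would return to wherever the caller's convention says (EDX and ECX in the macro further down).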
Finally, for my OS, the kernel has a large table of function pointers for the 32-bit kernel API, and for 64-bit versions of the OS the kernel has a second large table of function pointers for the 64-bit kernel API. The 64-bit kernel API is only used by SYSCALL, and the SYSCALL code just does "call [kernel_API_table_64 + rax * 8]" before doing SYSRET. The 32-bit kernel API is used by a software interrupt, a call gate, SYSCALL, SYSENTER, and the invalid opcode handler (emulated SYSCALL and emulated SYSENTER); where all of these things do "call [kernel_API_table_32 + eax * 4]" before returning using whatever method is appropriate (IRETD, RETF, SYSRET, SYSEXIT or IRETD).
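In C terms, the table dispatch amounts to something like the sketch below. The two API functions, the two-argument signature, the error code and the bounds check are all assumptions for illustration (the assembly above just indexes the table with EAX).

```c
#include <stddef.h>
#include <stdint.h>

#define API_ERR_BAD_FUNCTION ((uint32_t)-1)   /* hypothetical error code */

/* Hypothetical signature: every kernel API function takes two arguments. */
typedef uint32_t (*kernel_api_fn)(uint32_t arg1, uint32_t arg2);

/* Two made-up API functions, just to populate the table. */
static uint32_t api_get_version(uint32_t a, uint32_t b) { (void)a; (void)b; return 0x0100; }
static uint32_t api_add(uint32_t a, uint32_t b) { return a + b; }

/* One table shared by every 32-bit entry method. */
static const kernel_api_fn kernel_api_table_32[] = { api_get_version, api_add };

#define KERNEL_API_COUNT (sizeof kernel_api_table_32 / sizeof kernel_api_table_32[0])

/* Equivalent of "call [kernel_API_table_32 + eax * 4]", plus a range check. */
static uint32_t dispatch_kernel_api(uint32_t function_number,
                                    uint32_t arg1, uint32_t arg2)
{
    if (function_number >= KERNEL_API_COUNT)
        return API_ERR_BAD_FUNCTION;
    return kernel_api_table_32[function_number](arg1, arg2);
}
```

Each entry point (software interrupt, call gate, SYSCALL, SYSENTER, emulation) would call the same dispatcher and differ only in how it returns to the caller.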
This means that for 32-bit code I can do something like:
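    %macro CALL_KERNEL 1
        mov eax,%1              ; EAX = kernel API function number
    %ifdef USE_SYSENTER
        push ecx                ; SYSEXIT takes ESP from ECX and EIP from EDX,
        push edx                ;  so preserve the caller's values
        mov ecx,esp             ; ECX = user ESP for the kernel to return with
        mov edx,%%return        ; EDX = user EIP for the kernel to return to
        sysenter
    %%return:                   ; kernel's SYSEXIT lands here
        pop edx
        pop ecx
    %elifdef USE_SYSCALL
        syscall                 ; note: SYSCALL overwrites ECX with the return EIP
    %elifdef USE_CALL_GATE
        call KERNEL_API_GATE:0x00000000
    %else
        int KERNEL_API_TRAP
    %endif
    %endmacro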
Then, to call the kernel:
    mov ebx, foo
    mov esi, bar
    CALL_KERNEL function_number
This means that regardless of which method is actually used (and regardless of whether SYSCALL/SYSENTER are used when the CPU doesn't support them), everything involved with the 32-bit kernel API behaves exactly the same...
I should also mention that (AFAIK) for most OSs (e.g. Windows, Linux) the application doesn't really use the kernel API directly - the application uses a shared library or DLL, and the shared library or DLL uses the kernel API. This allows a different library to be used to suit the situation (e.g. if SYSCALL is supported then use the library that uses SYSCALL, if SYSENTER is supported then use the library that uses SYSENTER, else use the library that uses the software interrupt). This is a valid way of doing things (if your OS supports shared libraries or DLLs and the potential for "dependency hell"), but at a minimum it also adds the cost of a near call/ret to each kernel API call.
Cheers,
Brendan