rdos wrote:I think the fastest way to do syscalls on x86 is to allocate a callgate with every entrypoint. This will leave all CPU-registers available (no need to use & copy the stack in most (all) cases). It doesn't need to set up function numbers on entry, and it doesn't need decoding functions in the kernel, and eventually to do a call / jmp to the real entrypoint. The only drawback is that GDT selectors are a limited resource.
Owen wrote:By the time you have even passed through a call gate on a modern processor, you could pretty much have been through syscall/sysret twice.
Perhaps, but speed is more important on older processors, and even if you can execute the syscall/sysret twice, that is typically not even close to the whole procedure of setting up the call, syscall, decoding it and jumping to the final destination. I've seen the coding/decoding code both in DOS and Windows, and it's terribly slow. In the past people have minimized syscalls just because they are so terribly slow. I never do that because my syscalls are fast, and I can set up my compiler (Open Watcom) to use the registers the call is defined to use and thus eliminate the intermediate step of loading registers from the stack, which I needed with Borland's compiler. In essence, the call will go directly from C/C++ to kernel with no coding/decoding overhead.
Kernel requests via page faults
Re: Kernel requests via page faults
lemonyii wrote:however, nice idea.
it's good for non-assembly programming, and easy to implement. but i didn't consider the speed.
anyway, it's just an entrance, and it varies between platforms.
my opinion is, keep the central part of the code unchanged, and choose the most practical (fastest, easiest decoding, fewest exceptions......) entrance on the platform.
and of course, we may have many entrances, but we don't need them i think.
Yes, there is a need to look at the whole sequence, not just the switch from user to kernel. Speed should be measured from the last useful instruction in C/C++ until the first useful instruction in the device-driver. One possible advantage of using pagefaults would be similar to that of callgates: the handler destination could be coded somewhere directly, and this could eliminate the usual coding/decoding logic of syscall/sysexit.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: Kernel requests via page faults
rdos wrote:I think the fastest way to do syscalls on x86 is to allocate a callgate with every entrypoint. This will leave all CPU-registers available (no need to use & copy the stack in most (all) cases). It doesn't need to set up function numbers on entry, and it doesn't need decoding functions in the kernel, and eventually to do a call / jmp to the real entrypoint. The only drawback is that GDT selectors are a limited resource.
Owen wrote:By the time you have even passed through a call gate on a modern processor, you could pretty much have been through syscall/sysret twice.
rdos wrote:Perhaps, but speed is more important on older processors, and even if you can execute the syscall/sysret twice, that is typically not even close to the whole procedure of setting up the call, syscall, decoding it and jumping to the final destination. I've seen the coding/decoding code both in DOS and Windows, and it's terribly slow. In the past people have minimized syscalls just because they are so terribly slow. I never do that because my syscalls are fast, and I can set up my compiler (Open Watcom) to use the registers the call is defined to use and thus eliminate the intermediate step of loading registers from the stack, which I needed with Borland's compiler. In essence, the call will go directly from C/C++ to kernel with no coding/decoding overhead.
GCC knows how to inline my system calls too. More importantly, my syscall dispatch code comes down to little more than
- Check that the system call number doesn't exceed the maximum legal value
- Jump to the correct syscall
- Return
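OK; there's a little more complexity than this: I have to SWAPGS to get the kernel per-CPU information (but this is a given either way) and re-enable interrupts, but the overhead is far less than that of a call gate.
From memory, the syscall code looks like
Code: Select all
syscallEntry64:
    swapgs                              # switch to the kernel GS base (per-CPU data)
    mov  %gs:TCB_RSP0_OFFSET, %rsp      # load the kernel stack pointer
    sti                                 # re-enable interrupts
    cmpq $SYSCALL_COUNT, %rax           # is the syscall number in range?
    jae  _badSyscallNumber              # unsigned compare: reject %rax >= SYSCALL_COUNT
    mov  syscallVec(,%rax,8), %rax      # fetch the handler (assuming 8-byte table entries)
    call *%rax
    swapgs                              # restore the user GS base
    sysretq                             # resumes at %rcx with RFLAGS taken from %r11
I could probably streamline things by defining an inline function which does something along the lines of
Code: Select all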
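static inline void syscallReturn1(_FMK_ksysparam retV)
{
    /* Return straight to user mode with the result in RAX. */
    asm volatile("swapgs; sysretq" :: "a"(retV.as_int));
    __builtin_unreachable();
}
and jumping directly to the syscall functions. In fact, I'll probably move to doing this; it makes returning multiple values trivial too.
Register pressure is a complete non-issue; one should be re-evaluating things if they need that many parameters (particularly when developing an asynchronous microkernel!)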
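As a rough illustration of the three dispatch steps listed in this post (range check, jump through the table, return), here is a minimal sketch in portable C. All names in it (`dispatch_syscall`, `syscall_vec`, `sys_add`, `sys_first`, `ERR_BAD_SYSCALL`) are hypothetical and are not taken from the kernel under discussion, whose real entry path is the assembly shown in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler signature: two register-sized arguments, one result. */
typedef int64_t (*syscall_fn)(uint64_t a0, uint64_t a1);

/* Hypothetical error code returned for an out-of-range syscall number. */
enum { ERR_BAD_SYSCALL = -1 };

/* Two toy handlers, present only so the table has something to point at. */
static int64_t sys_add(uint64_t a0, uint64_t a1) { return (int64_t)(a0 + a1); }
static int64_t sys_first(uint64_t a0, uint64_t a1) { (void)a1; return (int64_t)a0; }

/* The dispatch table, indexed by syscall number. */
static const syscall_fn syscall_vec[] = { sys_add, sys_first };
#define SYSCALL_COUNT (sizeof syscall_vec / sizeof syscall_vec[0])

static int64_t dispatch_syscall(uint64_t nr, uint64_t a0, uint64_t a1)
{
    /* Step 1: an unsigned compare rejects every number >= SYSCALL_COUNT. */
    if (nr >= SYSCALL_COUNT)
        return ERR_BAD_SYSCALL;
    /* Steps 2 and 3: indirect call through the table, then return its result. */
    return syscall_vec[nr](a0, a1);
}
```

Because the syscall number is treated as unsigned, a single comparison covers both "too large" and "negative" values, which is why the bounds check in the assembly only needs one branch.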
Re: Kernel requests via page faults
Owen wrote:GCC knows how to inline my system calls too. More importantly, my syscall dispatch code comes down to little more than
- Check that the system call number doesn't exceed the maximum legal value
- Jump to the correct syscall
- Return
OK; there's a little more complexity than this: I have to SWAPGS to get the kernel per-CPU information (but this is a given either way) and re-enable interrupts, but the overhead is far less than that of a call gate.
From memory, the syscall code looks like
Code: Select all
syscallEntry64:
    swapgs
    mov  %gs:TCB_RSP0_OFFSET, %rsp
    sti
    cmpq $SYSCALL_COUNT, %rax
    jae  _badSyscallNumber
    mov  syscallVec(,%rax,8), %rax
    call *%rax
    swapgs
    sysretq
I could probably streamline things by defining an inline function which does something along the lines of
Code: Select all
static inline void syscallReturn1(_FMK_ksysparam retV)
{
    asm volatile("swapgs; sysretq" :: "a"(retV.as_int));
    __builtin_unreachable();
}
and jumping directly to the syscall functions. In fact, I'll probably move to doing this; it makes returning multiple values trivial too.
Register pressure is a complete non-issue; one should be re-evaluating things if they need that many parameters (particularly when developing an asynchronous microkernel!)
OK, that looks pretty good. At least compared to DOS/Windows. However, I would want to see the complete code, which includes how the compiler codes the call in your application, and whether something special is required in the device-driver.
Let's take the readfile call as an example. It is defined like this for C/C++ code:
int RDOSAPI RdosReadFile(int Handle, void *Buf, int Size);
The call-side macro looks like this:
#pragma aux RdosReadFile = \
    CallGate_read_file \
    ValidateEax \
    parm [ebx] [edi] [ecx] \
    value [eax];
This tells the compiler to load the handle into ebx, the buffer into edi and the size into ecx, and to return the number of bytes read in eax.
It will typically expand to something like this:
    mov ebx,filehandle
    mov edi,buffer
    mov ecx,size
    call far 0xE800:0x00000000 ; selector will be dynamically allocated on first call, it is just an example.
    jnc read_ok
    ;
    xor eax,eax
read_ok:
The device-driver will register its entry-point with "usergate-manager", and it will contain no extra code in the entry portion. It doesn't need to validate parameters: it will use the user-mode es selector to access the buffer, and if that fails, the code faults. Userlevel cannot pass pointers to kernel-space because the user-mode es register does not map the kernel. The handle will be "dereferenced" by "handle manager", but this is typically a fast procedure.
- Owen
Re: Kernel requests via page faults
_FMK_AMD64_SCDEF is normally "static inline", and _FMK_Syscall is very similar to _FMK_SyscallR except that it only returns in RAX, rather than in additional registers.
The standard system call wrapper is:
Code: Select all
_FMK_AMD64_SCDEF _FMK_sysparam _FMK_SyscallR(_FMK_syscall sc,
    _FMK_u8 argc, const _FMK_sysparam argv[],
    _FMK_u8 retc, const _FMK_sysretn  retv[]
)
{
    _FMK_u8 i;
    _FMK_sysparam rv;
    register _FMK_sysparam arg0 _FMK_asmreg(rdi);
    register _FMK_sysparam arg1 _FMK_asmreg(rsi);
    register _FMK_sysparam arg2 _FMK_asmreg(rdx);
    register _FMK_sysparam arg3 _FMK_asmreg(rcx);
    /* Load only as many arguments as were passed; each case falls through. */
    switch(__builtin_constant_p(argc) ? argc : _FMK_SC_MAX_ARGS) {
    case 4: arg3 = argv[3]; /* fall through */
    case 3: arg2 = argv[2]; /* fall through */
    case 2: arg1 = argv[1]; /* fall through */
    case 1: arg0 = argv[0];
    }
    /* One asm statement per number of returned values. */
    switch(__builtin_constant_p(retc) ? retc : _FMK_SC_MAX_ARGS) {
    case 1:
        __asm__ __volatile__("int $0xFF"
            : "=a"(rv),
              "=D"(*retv[0])
            : "0" (sc),
              "1" (arg0),
              "S" (arg1),
              "d" (arg2),
              "c" (arg3)
            : "memory"
        ); break;
    case 2:
        __asm__ __volatile__("int $0xFF"
            : "=a"(rv),
              "=D"(*retv[0]),
              "=S"(*retv[1])
            : "0" (sc),
              "1" (arg0),
              "2" (arg1),
              "d" (arg2),
              "c" (arg3)
            : "memory"
        ); break;
    case 3:
        __asm__ __volatile__("int $0xFF"
            : "=a"(rv),
              "=D"(*retv[0]),
              "=S"(*retv[1]),
              "=d"(*retv[2])
            : "0" (sc),
              "1" (arg0),
              "2" (arg1),
              "3" (arg2),
              "c" (arg3)
            : "memory"
        ); break;
    case 4:
        __asm__ __volatile__("int $0xFF"
            : "=a"(rv),
              "=D"(*retv[0]),
              "=S"(*retv[1]),
              "=d"(*retv[2]),
              "=c"(*retv[3])
            : "0" (sc),
              "1" (arg0),
              "2" (arg1),
              "3" (arg2),
              "4" (arg3)
            : "memory"
        ); break;
    }
    return rv;
}
Normally, this is called from code like
Code: Select all
FMK_result FMK_KDebugOutChar(char c)
{
    return (FMK_result)
        _FMK_Syscall(_FMK_SC_DebugOut,
                     _FMK_SC_ARGC(1, 0, 0, 0, 0, 0),
                     _FMK_SC_ARGV(_FMK_SCA_U8(c))
        );
}
The general generated code is
Code: Select all
FMK_KDebugOutChar:
    mov $_FMK_SC_DebugOut, %eax
    syscall
    ret
Re: Kernel requests via page faults
You would need some serious revision of this code once you come closer to "production stage", because user-level can pass any kind of garbage to the device-driver. It can even trash your kernel by deliberately or accidentally passing addresses of kernel-space data structures in the parameters.
Part of my strategy is to minimize pointers and data-structures passed from userlevel to kernel, as these either need some hardware protection (the use of segreg:offset where segreg can never access kernel data), or some kind of parameter validation in software in the entry portion of the device-driver. For 64-bit code, the only option is software validation since segmentation is not supported.
Re: Kernel requests via page faults
rdos wrote:You would need some serious revision of this code once you come closer to "production stage", because user-level can pass any kind of garbage to the device-driver. It can even trash your kernel by deliberately or accidentally passing addresses of kernel-space data structures in the parameters.
Part of my strategy is to minimize pointers and data-structures passed from userlevel to kernel, as these either need some hardware protection (the use of segreg:offset where segreg can never access kernel data), or some kind of parameter validation in software in the entry portion of the device-driver. For 64-bit code, the only option is software validation since segmentation is not supported.
Assuming an upper-half kernel, pointer validation only requires a simple comparison. Surely that is not an issue.
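To make the "simple comparison" concrete, here is a minimal C sketch of validating a user buffer against an upper-half kernel. The boundary `USER_TOP` and the function name `user_range_ok` are assumptions made for illustration (the value used is the end of the lower canonical half on x86-64 with 48-bit addresses); nothing here is taken from any kernel discussed in this thread. Once a length is involved, one extra comparison is needed so that the end of the buffer is checked as well as its start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: user space is [0, USER_TOP), the kernel lives above it. */
#define USER_TOP UINT64_C(0x0000800000000000)

/* Returns true if the whole range [addr, addr + len) lies in user space. */
static bool user_range_ok(uint64_t addr, uint64_t len)
{
    /* addr + len is never computed, so the check cannot be fooled
       by a length that makes the sum wrap around to a small value. */
    return addr < USER_TOP && len <= USER_TOP - addr;
}
```

A kernel would call something like this on every pointer and length pair at the syscall entry, before touching the buffer. The check only establishes that the range is on the user side of the boundary; whether the pages are actually mapped is still left to the page-fault handler.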
If a trainstation is where trains stop, what is a workstation ?
Re: Kernel requests via page faults
gerryg400 wrote:Assuming an upper-half kernel, pointer validation only requires a simple comparison. Surely that is not an issue.
In the example above there are many pointers that need validation. It is also easy to forget some validation since these pointers are "hidden".
Re: Kernel requests via page faults
rdos wrote:In the example above there are many pointers that need validation. It is also easy to forget some validation since these pointers are "hidden".
Remember that we are looking at the user side of the system call here. No validation is needed there at all.
- Owen
Re: Kernel requests via page faults
rdos wrote:In the example above there are many pointers that need validation. It is also easy to forget some validation since these pointers are "hidden".
gerryg400 wrote:Remember that we are looking at the user side of the system call here. No validation is needed there at all.
Indeed. It also occurs to me that I need to swivel some registers to avoid syscall trampling them. There is a version of this code for each supported platform (at present, AMD64 and ARM).
Re: Kernel requests via page faults
Owen wrote:I need to swivel some registers
Swivel? What do you mean?
- Owen
Re: Kernel requests via page faults
Owen wrote:I need to swivel some registers
gerryg400 wrote:Swivel? What do you mean?
RCX, for example, is used by syscall to hold the return address, so it can't be used for parameter passing. The RCX slot will be swapped for a different callee-clobber register, and RCX added to the clobber list. The same would apply to R11, which syscall overwrites with the saved flags, if it were involved in parameter passing.
The stub for a many-parameter system call, when not inlined, would then look like:
Code: Select all
FMK_ABigSyscallStub:
    mov $_FMK_SC_ABigSyscall, %eax
    mov %rcx, %r8
    syscall
    ret
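For comparison, here is a minimal sketch of what the register swap looks like in GCC inline assembly. It deliberately uses the Linux x86-64 convention (fourth argument in R10, number in RAX) rather than the ABI of the kernel discussed here, purely so that it can be compiled and run on a stock system; the function name `raw_syscall4` is hypothetical. On other platforms the sketch falls back to libc's `syscall()` so that it still builds.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue a system call with up to four arguments. */
static inline long raw_syscall4(long nr, long a0, long a1, long a2, long a3)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* The fourth argument travels in r10, because the SYSCALL
       instruction itself overwrites rcx with the return address. */
    register long r10 __asm__("r10") = a3;
    __asm__ __volatile__("syscall"
                         : "=a"(ret)
                         : "0"(nr), "D"(a0), "S"(a1), "d"(a2), "r"(r10)
                         /* rcx and r11 are destroyed by SYSCALL itself. */
                         : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback so the sketch builds elsewhere: use libc's generic wrapper. */
    return syscall(nr, a0, a1, a2, a3);
#endif
}
```

Calling `raw_syscall4(SYS_getpid, 0, 0, 0, 0)` should give the same value as `getpid()`. Listing RCX and R11 as clobbers is what tells the compiler not to keep live values in them across the instruction, which is the same concern the stub above addresses by moving the RCX argument elsewhere.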
Re: Kernel requests via page faults
Okay. That reminds me. I'm still using int to enter my kernel. Must fix that!
Re: Kernel requests via page faults
rdos wrote:I think the fastest way to do syscalls on x86 is to allocate a callgate with every entrypoint. This will leave all CPU-registers available (no need to use & copy the stack in most (all) cases). It doesn't need to set up function numbers on entry, and it doesn't need decoding functions in the kernel, and eventually to do a call / jmp to the real entrypoint. The only drawback is that GDT selectors are a limited resource.
IIRC the fastest way to do a syscall on 64-bit x86 is SYSCALL/SYSRET, as they've been specially optimised for this case.
Re: Kernel requests via page faults
rdos wrote:I think the fastest way to do syscalls on x86 is to allocate a callgate with every entrypoint. This will leave all CPU-registers available (no need to use & copy the stack in most (all) cases). It doesn't need to set up function numbers on entry, and it doesn't need decoding functions in the kernel, and eventually to do a call / jmp to the real entrypoint. The only drawback is that GDT selectors are a limited resource.
JamesM wrote:IIRC the fastest way to do a syscall on 64-bit x86 is SYSCALL/SYSRET, as they've been specially optimised for this case.
Except for exceptions, it is more or less the only way on 64-bit. However, I don't target 64-bit, only 386+ processors (IA32), and I have no plans to switch to 64-bit. There is no need for 64-bit code on embedded platforms.