inline system calls?

rdos · Post by **rdos** » Sat May 06, 2017 2:44 am

I always inline "neutral" code that generates an exception, and then patch it in the exception handler to whatever code the setup uses.

LtG · Post by **LtG** » Sat May 06, 2017 6:16 am

Brendan wrote: The CALL instruction is relatively simple. SYSCALL is more complex and has multiple edge cases, where it might be fine most of the time but on rare occasions several conditions occur at the same time and you end up with something like this or this.

As Intel points out, those two are software issues. Yes, you have to be careful and think things through.

Brendan wrote: For software that hopes to be around for a while; it's good practice to design it with some flexibility, so that if things change in future you have a way to deal with those changes (that doesn't break backward compatibility).

The key is where you put that flexibility: in the right place, you get all the flexibility you need with little or no impact on performance.

For example, if you are using byte code (or IL or something else), you can pick the best option at install time, when doing the final conversion from byte code to machine code for that specific machine. That also lets you use the best CPU instructions the system happens to support for everything, not just for syscalls.
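As a rough illustration of that install-time choice, the C sketch below picks a system call mechanism from raw CPUID feature bits. The enum, the function names and the preference order are assumptions made for this example and do not come from the thread. The bit positions are the documented ones: SEP is bit 11 of EDX for CPUID leaf 1, and SYSCALL/SYSRET is bit 11 of EDX for leaf 80000001h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mechanisms an installer could choose between. */
enum syscall_kind { SYSCALL_INT, SYSCALL_SYSENTER, SYSCALL_SYSCALL };

#define CPUID_01_EDX_SEP       (1u << 11) /* SYSENTER/SYSEXIT present */
#define CPUID_EXT1_EDX_SYSCALL (1u << 11) /* SYSCALL/SYSRET present   */

/*
 * Pick a mechanism from raw CPUID register values.
 * std_edx: EDX of CPUID leaf 1; ext_edx: EDX of leaf 0x80000001.
 * The preference order is an assumption, not a recommendation.
 */
static enum syscall_kind pick_syscall_kind(uint32_t std_edx, uint32_t ext_edx)
{
    if (ext_edx & CPUID_EXT1_EDX_SYSCALL)
        return SYSCALL_SYSCALL;
    if (std_edx & CPUID_01_EDX_SEP)
        return SYSCALL_SYSENTER;
    return SYSCALL_INT; /* a software interrupt works on every CPU */
}

/* Name of the chosen mechanism, e.g. for an installer log line. */
static const char *syscall_kind_name(enum syscall_kind k)
{
    switch (k) {
    case SYSCALL_INT:      return "int";
    case SYSCALL_SYSENTER: return "sysenter";
    case SYSCALL_SYSCALL:  return "syscall";
    }
    assert(!"unknown syscall_kind");
    return "?";
}
```

An installer would call `pick_syscall_kind()` once with the values it read from CPUID and then emit the matching instruction sequence at every system call site it translates.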

Brendan wrote: In theory it's possible to create tools that fix all the "compile before use is too painful/annoying/fragile for C" problems; but in practice nobody ever will - they either won't be willing to write their own tools at all, or they'll go all the way and abandon C itself (e.g. create their own language).

"Ever" is a long time; hopefully someone will, or C gets replaced. All the make/build tools that exist clearly show that there is a problem: compilers create dependency files (listing the headers that are included) just so that Make knows what to do, and so on. It's a mess. Sure, it can be used, but it's too complex and takes too much effort.

And yes, I have considered creating a new language as it would have significant benefits, but that just adds to the effort initially so I haven't done so yet. Hopefully the effort gets repaid so it's not wasted effort =)

I think part of the problem is that C was initially created for writing Unix: they designed the language so they could write their OS in it. Thing is, our OSes aren't Unix (and it's decades later), yet we're still trying to use that language.

LtG · Post by **LtG** » Sat May 06, 2017 6:20 am

rdos wrote: I always inline "neutral" code that generates an exception, and then patch it in the exception handler to whatever code the setup uses.

The "neutral" code is #UD or something like "int 0x80"?

But what's the benefit of doing it that way, compared to putting it in the places the binary image (ELF, PE, whatever) explicitly requests? Don't you also need padding to account for possibly longer "syscall" sequences later?

Also, isn't that like self-modifying code? I'm not against it, but won't it interfere with an application that wants to modify its own code (it is usually problematic when two entities modify the same thing)? I understand that you might not want to support self-modifying code, so for you it's a moot point, but from a theoretical perspective, why pick a solution that is more limiting with no extra benefits?

Simply put, is there anything that makes that solution better than the others?

rdos · Post by **rdos** » Mon May 08, 2017 12:31 pm

> LtG wrote:
> > rdos wrote: I always inline "neutral" code that generates an exception, and then patch it in the exception handler to whatever code the setup uses.
>
> The "neutral" code is #UD or something like "int 0x80"?
>
> But what's the benefit of doing it that way, compared to putting it in the places the binary image (ELF, PE, whatever) explicitly requests? Don't you also need padding to account for possibly longer "syscall" sequences later?
>
> Also, isn't that like self-modifying code? I'm not against it, but won't it interfere with an application that wants to modify its own code (it is usually problematic when two entities modify the same thing)? I understand that you might not want to support self-modifying code, so for you it's a moot point, but from a theoretical perspective, why pick a solution that is more limiting with no extra benefits?
>
> Simply put, is there anything that makes that solution better than the others?

No, it is not a simple int x code.

It's like this:

    db 55h
    db 67h
    db 9Ah
    dd gate_nr
    dw 3
    db 5Dh

When disassembled, it will be like this:

    push ebp
    call far 0003:gate_nr
    pop ebp

The middle instruction will raise a GP fault because selector 3 is a null selector, and the fault handler then typically patches it to an interrupt:

    db 55h
    db 0CDh
    db 9Ah
    dd gate_nr
    dw 3
    db 5Dh

Resulting code:

    push ebp
    int 9Ah
    dd gate_nr
    dw 3
    pop ebp

The re-execution of the faulting instruction will then execute int 9Ah, which goes to the kernel's int 9Ah handler. This handler sets the gate number to 0 and the selector to the new call gate (creating it if it didn't already exist). Lastly, it overwrites the int opcode with 90h:

    db 55h
    db 90h
    db 9Ah
    dd 0
    dw gate_sel
    db 5Dh

Corresponding to:

    push ebp
    nop
    call far gate_sel:0
    pop ebp

The reason for the two-step process is to make the patching multicore-safe. The fault handlers are a little more complex because of this. Because it is multicore-safe, multiple cores can execute this code at the same time, and all of them will end up calling the correct syscall.
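To make the byte-level sequence easier to follow, here is a small single-threaded C model of the 10-byte call site and the two patch steps. It is only a sketch of the description above: all function and type names are invented for illustration, the selector value passed to step 2 is supplied by the caller (a real handler would look up or create the call gate from the gate number first), and none of the locking or ordering that a real multicore kernel needs is shown.

```c
#include <assert.h>
#include <stdint.h>

enum { SITE_LEN = 10 };
enum site_state { SITE_UNPATCHED, SITE_INT, SITE_CALLGATE, SITE_UNKNOWN };

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Build the unpatched site: push ebp / 67h call far 0003:gate_nr / pop ebp. */
static void site_init(uint8_t *site, uint32_t gate_nr)
{
    site[0] = 0x55;              /* push ebp */
    site[1] = 0x67;              /* address-size prefix */
    site[2] = 0x9A;              /* call far ptr16:32 */
    put_le32(&site[3], gate_nr); /* offset field carries the gate number */
    put_le16(&site[7], 3);       /* null selector with RPL 3 */
    site[9] = 0x5D;              /* pop ebp */
}

/* Read the 32-bit offset field (gate number before step 2, 0 after it). */
static uint32_t site_gate_nr(const uint8_t *site)
{
    return (uint32_t)site[3] | ((uint32_t)site[4] << 8) |
           ((uint32_t)site[5] << 16) | ((uint32_t)site[6] << 24);
}

static enum site_state site_classify(const uint8_t *site)
{
    if (site[1] == 0x67 && site[2] == 0x9A) return SITE_UNPATCHED;
    if (site[1] == 0xCD && site[2] == 0x9A) return SITE_INT;      /* int 9Ah */
    if (site[1] == 0x90 && site[2] == 0x9A) return SITE_CALLGATE; /* nop + far call */
    return SITE_UNKNOWN;
}

/* Step 1 (GP fault handler): turn the prefix byte into an int opcode. */
static void patch_step1(uint8_t *site)
{
    assert(site_classify(site) == SITE_UNPATCHED);
    site[1] = 0xCD; /* bytes 1..2 now decode as int 9Ah */
}

/* Step 2 (int 9Ah handler): fill in the far pointer, then drop the int. */
static void patch_step2(uint8_t *site, uint16_t gate_sel)
{
    assert(site_classify(site) == SITE_INT);
    put_le32(&site[3], 0);        /* offset is ignored for call gates */
    put_le16(&site[7], gate_sel); /* selector of the call gate */
    site[1] = 0x90;               /* written last: nop, far call is now live */
}
```

Between the two steps every core that reaches the site sees a complete `int 9Ah`, and after the final one-byte store it sees a complete far call, which is the property the two-step scheme relies on.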

The reason the initial code is a far call is so that the application debugger knows that it is a call (and thus traceable), and can set a breakpoint after it in order to step over it. Invalid opcodes don't have these features.

When patching for sysenter instead, the code will be modified like this in the second stage (again, the int opcode is patched last):

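    db 55h                  ; push ebp
    db 90h                  ; nop (replaces the int opcode, patched last)
    db 0E8h                 ; call rel32
    dd app_stub - ($ + 4)   ; displacement relative to the next instruction
    dw 9090h                ; nop, nop
    db 5Dh                  ; pop ebp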

In code:

    push ebp
    nop
    call app_stub
    nop
    nop
    pop ebp

One app stub needs to be created in user-space per syscall, and it looks like this:

    push ecx
    push edx
    mov ecx,esp
    mov edx,gate_nr
    sysenter
    pop edx
    pop ecx
    ret

As can be seen, this is less efficient, but so is the use of sysenter. Note that the OS must also implement an int 0E8h handler that is a copy of the int 9Ah handler, to handle the case where a core executes the int instruction after the vector byte has been modified but before the int opcode itself has been replaced.
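The need for the int 0E8h handler can be seen in a small C model of the second-stage patch. This is a sketch under assumptions: the function names and the split into a "body" and a "commit" step are made up for illustration, and the addresses are arbitrary example values. The only hard facts used are that opcode E8h is a near call with a 32-bit displacement relative to the next instruction, and that CDh is `int imm8`.

```c
#include <assert.h>
#include <stdint.h>

static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Vector a core would raise if it executed the site now, or -1 if none. */
static int pending_int_vector(const uint8_t *site)
{
    return site[1] == 0xCD ? site[2] : -1;
}

/*
 * Rewrite everything except the int opcode.
 * site_addr: run-time address of site[0]; stub_addr: address of the
 * per-syscall user-space stub. Both are hypothetical example inputs.
 */
static void sysenter_patch_body(uint8_t *site, uint32_t site_addr,
                                uint32_t stub_addr)
{
    assert(site[1] == 0xCD);      /* still in the "int" stage */
    site[2] = 0xE8;               /* call rel32; site now reads int 0E8h */
    /* rel32 is relative to the instruction after the call: site + 7 */
    put_le32(&site[3], stub_addr - (site_addr + 7));
    site[7] = 0x90;               /* the old selector bytes become nops */
    site[8] = 0x90;
}

/* Final one-byte store: replace the int opcode with a nop. */
static void sysenter_patch_commit(uint8_t *site)
{
    site[1] = 0x90;
}
```

In the window between `sysenter_patch_body()` and `sysenter_patch_commit()`, `pending_int_vector()` reports 0E8h instead of 9Ah, which is exactly the case the duplicate handler covers.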

Schol-R-LEA · Post by **Schol-R-LEA** » Mon May 08, 2017 12:56 pm

Purely as an aside: why didn't you use code tags? I realize that formatting isn't significant in the case of assembly code, and that the snippets are exceedingly short, but, well, if nothing else, it gives a poor impression to newbies that a long-standing member is doing this. I expect that there may be a reason for this (e.g., you were posting from a cell phone or tablet and using the tags was impractical), but it might be worth making mention of it if so.

rdos · Post by **rdos** » Mon May 08, 2017 1:17 pm

Schol-R-LEA wrote: Purely as an aside: why didn't you use code tags? I realize that formatting isn't significant in the case of assembly code, and that the snippets are exceedingly short, but, well, if nothing else, it gives a poor impression to newbies that a long-standing member is doing this. I expect that there may be a reason for this (e.g., you were posting from a cell phone or tablet and using the tags was impractical), but it might be worth making mention of it if so.

OK, I changed the post. Not used to code tags nowadays.

Schol-R-LEA · Post by **Schol-R-LEA** » Mon May 08, 2017 2:41 pm

> rdos wrote:
> > Schol-R-LEA wrote: Purely as an aside: why didn't you use code tags? I realize that formatting isn't significant in the case of assembly code, and that the snippets are exceedingly short, but, well, if nothing else, it gives a poor impression to newbies that a long-standing member is doing this. I expect that there may be a reason for this (e.g., you were posting from a cell phone or tablet and using the tags was impractical), but it might be worth making mention of it if so.
>
> OK, I changed the post. Not used to code tags nowadays.

Fair enough; many, perhaps most, newer message boards use Snarkdown instead of BBgoad, so that would be reason enough. I've had the same problem too, before, assuming that's what happened.

LtG · Post by **LtG** » Tue May 09, 2017 12:30 pm

rdos, is there a reason you want to push and pop ebp? Also, why not simply leave the required number of bytes (you seem to use 8) in the image, record these locations, and patch the code at load/install time instead of at run time? Given that normal apps shouldn't have that many syscalls, I don't expect load/install-time patching to give much better performance, but it certainly won't be worse, and as an added bonus you don't have to worry about multicore issues with that approach.

So I'm asking what is the benefit you see with this more complicated approach?
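For comparison, the load-time alternative suggested here could look roughly like the C sketch below: the toolchain reserves a fixed-size slot at each call site and records its offset, and the loader fills every slot before the program runs. The fixup record layout, the function names and the emitted `mov eax, imm32` / `int 80h` sequence are all assumptions chosen for illustration; neither poster describes a concrete format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLOT_LEN = 8 }; /* reserved bytes per call site */

/* Hypothetical fixup record emitted by the toolchain. */
struct syscall_fixup {
    uint32_t image_offset; /* offset of the reserved slot in the image */
    uint32_t gate_nr;      /* which system call the slot should invoke */
};

/* Fill one slot with "mov eax, gate_nr / int 80h", padded with nops. */
static void emit_int80(uint8_t *slot, uint32_t gate_nr)
{
    memset(slot, 0x90, SLOT_LEN);  /* nop padding for the unused tail */
    slot[0] = 0xB8;                /* mov eax, imm32 */
    for (int i = 0; i < 4; i++)
        slot[1 + i] = (uint8_t)(gate_nr >> (8 * i));
    slot[5] = 0xCD;                /* int imm8 */
    slot[6] = 0x80;
}

/* Patch every recorded slot; returns the count, or -1 on a bad offset. */
static int apply_fixups(uint8_t *image, size_t image_len,
                        const struct syscall_fixup *fx, size_t n)
{
    assert(image != NULL && fx != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t off = fx[i].image_offset;
        if (off > image_len || image_len - off < SLOT_LEN)
            return -1;             /* slot would run past the image */
        emit_int80(image + off, fx[i].gate_nr);
    }
    return (int)n;
}
```

Because all slots are written before any thread of the program exists, this variant needs no fault handlers and no ordering tricks; the trade-off is that the loader must understand the fixup table of every executable format it supports.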

rdos · Post by **rdos** » Mon May 15, 2017 1:43 pm

> LtG wrote: rdos, is there a reason you want to push and pop ebp? Also, why not simply leave the required number of bytes (you seem to use 8) in the image, record these locations, and patch the code at load/install time instead of at run time? Given that normal apps shouldn't have that many syscalls, I don't expect load/install-time patching to give much better performance, but it certainly won't be worse, and as an added bonus you don't have to worry about multicore issues with that approach.
>
> So I'm asking what is the benefit you see with this more complicated approach?

The addition of push and pop ebp is recent. It was done to be able to reconstruct the usermode callstack when something crashed in kernel. It was especially useful for memory debugging, which has breakpoints in kernel when usermode code overruns its allocation or double frees.

As for why the usermode loader isn't used to fix up syscalls at load time, the reason is that these syscalls are also used in kernel, and RDOS supports several standard executable formats, even though 32-bit flat PE is the only one that is currently in use. In fact, almost all usermode-callable functions (except a few related to the usermode image itself) are usable in kernel. Things like the file API and the network stack are used in kernel drivers too. That also makes the reach of a function (usermode or kernel only) easy to redefine. If something that was originally supposed to be kernel only is later needed in usermode too, the function prototype is redefined, the code is recompiled, and it then works without any other modifications.

LtG · Post by **LtG** » Tue May 16, 2017 12:13 pm

> rdos wrote: The addition of push and pop ebp is recent. It was done to be able to reconstruct the usermode callstack when something crashed in kernel. It was especially useful for memory debugging, which has breakpoints in kernel when usermode code overruns its allocation or double frees.
>
> As for why the usermode loader isn't used to fix up syscalls at load time, the reason is that these syscalls are also used in kernel, and RDOS supports several standard executable formats, even though 32-bit flat PE is the only one that is currently in use. In fact, almost all usermode-callable functions (except a few related to the usermode image itself) are usable in kernel. Things like the file API and the network stack are used in kernel drivers too. That also makes the reach of a function (usermode or kernel only) easy to redefine. If something that was originally supposed to be kernel only is later needed in usermode too, the function prototype is redefined, the code is recompiled, and it then works without any other modifications.

Thanks, makes sense. I was wondering why you'd go to all that extra effort, but for your design it probably makes sense. For a micro-kernel there shouldn't be much (if any) reason to call into userland (where the drivers are), so runtime syscall "fixups" aren't really that useful for me.

OSDev.org
