Syscalls versus Call Gates

Jeko
Member
Posts: 500
Joined: Fri Mar 17, 2006 12:00 am
Location: Napoli, Italy

Re: Syscalls versus Call Gates

Post by Jeko »

jal wrote:
Jeko wrote:I think managed code MUST be slower than machine code. It's normal.
No, not necessarily. Managed code is checked once, to verify that it doesn't do anything dangerous. After that, it can be run at full speed.


JAL
In fact, machine code, by contrast, isn't checked at all...
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling

http://sourceforge.net/projects/jeko - Jeko Operating System
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Re: Syscalls versus Call Gates

Post by jal »

Jeko wrote:In fact, machine code, by contrast, isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...


JAL
inx
Member
Posts: 142
Joined: Wed Mar 05, 2008 12:52 am

Re: Syscalls versus Call Gates

Post by inx »

Brendan wrote:Given the choice between software interrupts, call gates, SYSENTER and SYSCALL, why not just implement all of them?
IIRC, Windows (NT at least, not sure about the DOS shell crap OSes) does just this. There's a page mapped into each process that contains a table of the relevant methods for syscalls. The processes simply jump to the proper offset in that page, and depending on the CPU, the table will contain SYSCALL, SYSENTER, or CALL FAR instructions. Kind of reminds me of the Kernal(sic) jump table on 8-bit Commodore machines. =p
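The scheme inx describes can be sketched in C: at boot, the kernel picks whichever entry method the CPU supports and publishes it at a fixed slot, so user code always calls through the same place regardless of CPU. This is only an illustrative sketch, not Windows' actual layout; the names (`SyscallStub`, `pick_syscall_stub`, the feature flags) are made up:

```c
#include <assert.h>

/* Hypothetical CPU feature flags, as CPUID detection might report them. */
enum cpu_features { FEAT_SYSENTER = 1 << 0, FEAT_SYSCALL = 1 << 1 };

/* In a real kernel each stub would be a few instructions ending in
 * SYSCALL, SYSENTER, or CALL FAR; ordinary functions stand in here. */
typedef long (*SyscallStub)(long function_number);

static long stub_sysenter(long n) { return n; /* would execute SYSENTER */ }
static long stub_syscall(long n)  { return n; /* would execute SYSCALL  */ }
static long stub_callfar(long n)  { return n; /* would execute CALL FAR */ }

/* At boot, the kernel writes the best available stub into the shared
 * page; processes always jump through this one published slot. */
static SyscallStub pick_syscall_stub(unsigned features)
{
    if (features & FEAT_SYSCALL)  return stub_syscall;
    if (features & FEAT_SYSENTER) return stub_sysenter;
    return stub_callfar;          /* always-available fallback */
}
```

User code never needs to know which instruction was chosen; it calls through the published slot, which is exactly the portability point being made above.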
Jeko
Member
Posts: 500
Joined: Fri Mar 17, 2006 12:00 am
Location: Napoli, Italy

Re: Syscalls versus Call Gates

Post by Jeko »

jal wrote:
Jeko wrote:In fact, machine code, by contrast, isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...


JAL
Sorry... but:
managed code is checked, and only after that can it run;
native code runs without any checking.

Surely native code is faster than managed code, then.
Where am I wrong?
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling

http://sourceforge.net/projects/jeko - Jeko Operating System
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Syscalls versus Call Gates

Post by Brendan »

Hi,
Jeko wrote:I know that INT 3 is the Breakpoint Exception...
But:
Brendan wrote:The "INT3" instruction is only one byte (which makes it the smallest possible option) but it's probably better to use it for debugging purposes.
INT 3 is the breakpoint exception, but I can use it however I want.
Yes.

Jnc100 is right - INT3 is intended for debuggers to use, so that any instruction can be easily replaced with a breakpoint.

However, if you don't care if debuggers are able to do that (or if you don't want debuggers to be able to do that) then using INT3 can make sense. For example, for my OS's boot code I normally do use INT3 for a "boot API". This is because I do my debugging with Bochs and don't want other people to do their own debugging with INT3, because the code is only run once during boot (size matters more) and because I start using this API before I've done CPU detection (working out if SYSCALL/SYSENTER is supported is more annoying than it's worth in this case).
Jeko wrote:However when an invalid opcode exception occurs how can I check that it's a SYSCALL/SYSENTER or a SYSRET/SYSEXIT?

That's easy enough. Let's start from the start...

Regardless of which method/s you use, it's likely you'll have many kernel API functions, and the caller will put the kernel function number into a register before calling the kernel API. Basically the caller does something like:

Code: Select all

    mov eax, functionNumber
    <int3, int n, call far, syscall, or sysenter>
And inside the kernel you've got a table of function pointers, and probably end up doing something like:

Code: Select all

    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]

Now, if they use SYSCALL or SYSENTER and the CPU doesn't support it then you'll get an invalid opcode exception. The CPU will put the return EIP on the stack, so your invalid opcode exception handler can get the return EIP from the stack and use it to see which opcodes were used to cause the invalid opcode exception. The opcode for SYSCALL is "0x0F 0x05", and the opcode for SYSENTER is "0x0F 0x34", so you'd do something like:

Code: Select all

    if( *returnEIP == 0x0F) {
        if( *(returnEIP + 1) == 0x05) {
            /* Emulate SYSCALL */
        } else if( *(returnEIP + 1) == 0x34) {
            /* Emulate SYSENTER */
        }
    }
Then, to emulate SYSCALL you'd cheat. You're already at CPL=0 because of the exception, and the function number the caller wanted is still in EAX. One catch: the exception's return EIP points at the faulting SYSCALL itself, so you need to step it past the 2-byte opcode, or IRETD would just re-execute the invalid instruction. After that, you can just do:

Code: Select all

    add dword [esp],2      ;Skip the 2-byte SYSCALL opcode so IRETD doesn't re-fault
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd
Emulating SYSENTER is a little harder, because the return EIP needs to come from EDX and the return ESP needs to come from ECX. This still isn't too hard though - you just copy these registers into the exception handler's stack:

Code: Select all

    mov [esp], edx         ;Set new return EIP
    mov [esp+12], ecx      ;Set new return ESP
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd

So, as a summary, the entire thing (for software interrupts, call gates, SYSCALL, SYSENTER, emulated SYSCALL and emulated SYSENTER) would end up something like:

Code: Select all

kernelAPI_softwareInterruptHandler:
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd



kernelAPI_callGateHandler:
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    retf

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    retf



kernelAPI_syscallHandler:
    sti
    push edx
    mov edx,esp
    mov esp,<addressOfThisThreadsKernelStack>
    push edx
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    pop esp
    pop edx
    sysret

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    pop esp
    pop edx
    sysret



kernelAPI_sysenterHandler:
    sti
    mov esp,<addressOfThisThreadsKernelStack>
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    sysexit

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    sysexit



invalidOpcodeExceptionHandler:
    push edx
    mov edx,[esp+4]         ;edx = return EIP (address of invalid opcode)
    cmp byte [edx],0x0F     ;Could it have been SYSCALL or SYSENTER?
    jne .unknownInstruction ; no
    cmp byte [edx+1],0x05   ;Was it SYSCALL?
    je .emulateSYSCALL      ; yes
    cmp byte [edx+1],0x34   ;Was it SYSENTER?
    je .emulateSYSENTER     ; yes
.unknownInstruction:
    pop edx
    ;Do "Blue screen of death" or something here!
    jmp $

.emulateSYSCALL:
    pop edx
    add dword [esp],2      ;Skip the 2-byte SYSCALL opcode so IRETD doesn't re-fault
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.emulateSYSENTER:
    pop edx
    mov [esp], edx         ;Set new return EIP
    mov [esp+12], ecx      ;Set new return ESP
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd
Note1: None of this is tested (it's an *example* not production code).

Note2: There's a lot of setup stuff I skipped, like creating the IDT entry for the software interrupt, creating the GDT entry for the call gate, and setting MSRs for SYSENTER (if supported) and SYSCALL (if supported).

After you've got that working, the only thing you need to do is add your kernel API functions to the "kernelAPItable" table. The caller can use any method they like, and the kernel doesn't care which method they use.
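The "one table, many entry methods" design above can be sketched in C: the dispatcher is the only shared piece, and every entry stub (software interrupt, call gate, SYSCALL, SYSENTER, or the #UD emulation path) funnels into it. This is a minimal sketch; the table contents, function names, and the error constant are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define ERROR_BAD_FUNCTION_NUMBER (-1L)

typedef long (*KernelApiFn)(long arg);

/* Dummy kernel API functions standing in for real ones. */
static long sys_getpid(long arg) { (void)arg; return 42; }
static long sys_yield(long arg)  { (void)arg; return 0;  }

/* The shared table every entry method dispatches through. */
static const KernelApiFn kernel_api_table[] = { sys_getpid, sys_yield };
static const size_t api_table_entries =
    sizeof kernel_api_table / sizeof kernel_api_table[0];

/* C equivalent of "cmp eax,highest... / ja .bad / call [table+eax*4]". */
static long kernel_api_dispatch(size_t function_number, long arg)
{
    if (function_number >= api_table_entries)
        return ERROR_BAD_FUNCTION_NUMBER;
    return kernel_api_table[function_number](arg);
}
```

The bounds check before the indexed call is the important part: without it, a caller could make the kernel jump through an arbitrary out-of-range "function pointer".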


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Syscalls versus Call Gates

Post by Colonel Kernel »

Jeko wrote:managed code is checked, and only after that can it run;
native code runs without any checking.

Surely native code is faster than managed code, then.
Where am I wrong?
You're oversimplifying. Once running, managed code is machine code. Verification has a cost, yes, but it is a one-time cost. Once an executable has been checked it need not be re-checked unless it changes. On the other hand, using software isolation (as implemented in Singularity) instead of hardware isolation has several performance benefits:
  1. No privilege-level switching overhead for calling the kernel.
  2. No overhead of TLB misses when switching between processes, assuming that paging is not being used.
  3. No overhead of manipulating page tables, again assuming that paging is not being used.
  4. The managed-to-native compiler can optimize everything much more aggressively since dynamic loading is prohibited. For example, whole-program optimization and "tree shaking" to reduce the code size can really speed things up.
Read the Singularity papers and look at the benchmarks.
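Point 4's "tree shaking" is just reachability analysis over the whole program's call graph: start from the entry point, mark everything callable, and discard the rest, which a closed-world compiler can do safely because no dynamic loading will add callers later. A toy sketch with a hypothetical call graph:

```c
#include <assert.h>

#define NFUNCS 5

/* Toy call graph: calls[i][j] != 0 means function i may call function j.
 * Indices: 0=entry, 1=used_helper, 2=unused_helper, 3=leaf, 4=dead_leaf */
static const unsigned char calls[NFUNCS][NFUNCS] = {
    {0, 1, 0, 0, 0},   /* entry -> used_helper         */
    {0, 0, 0, 1, 0},   /* used_helper -> leaf          */
    {0, 0, 0, 0, 1},   /* unused_helper -> dead_leaf   */
    {0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0},
};

/* Mark every function reachable from 'entry'; unmarked functions can
 * be dropped from the final image to shrink it. */
static void mark_reachable(int entry, unsigned char reachable[NFUNCS])
{
    if (reachable[entry])
        return;                 /* already visited */
    reachable[entry] = 1;
    for (int j = 0; j < NFUNCS; j++)
        if (calls[entry][j])
            mark_reachable(j, reachable);
}
```

Here functions 2 and 4 are never marked, so the compiler could omit them entirely, which is the code-size win described above.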
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Re: Syscalls versus Call Gates

Post by AJ »

:oops: Sorry - I probably caused this confusion by mentioning managed code and JIT-ing in one breath, without mentioning verification for native compilation #-o
grover
Posts: 17
Joined: Wed Apr 30, 2008 7:20 am

Re: Syscalls versus Call Gates

Post by grover »

Jeko wrote:
jal wrote:
Jeko wrote:In fact, machine code, by contrast, isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...


JAL
Sorry... but:
managed code is checked, and only after that can it run;
native code runs without any checking.

Surely native code is faster than managed code, then.
Where am I wrong?
As a developer on several managed operating systems I'd like to try to give you another viewpoint:

1. Your C kernel (or ASM kernel, for that matter) depends on compiler settings or manual optimizations targeting a specific architecture, i.e. you benefit from the specific features of a *single* processor revision. You can't optimize the shipped code for the newest processor, which forces you either to write certain procedures multiple times, each optimized for a particular processor, or to not benefit from newer processors at all. The first option costs space; the second makes *your* code slow. A JIT, in contrast, knows the processor at hand and can fine-tune the generated code for the platform it runs on; e.g. it doesn't need to choose between int, sysenter/syscall, and far calls for system calls ahead of time, because it knows which one is fastest on its processor.

2. Because you don't know at compile time which machine you'll run on, you can only optimize for the worst case. This in most cases makes highly iterative code very slow, as the compiler usually makes bad choices with respect to loop unrolling.

3. The verification process does cost time, but that time is repaid many times over during the real runtime of an application. There are lots of benchmarks on the net which prove this. Why do you think most time-critical, heavily loaded business apps are written in Java or .NET? Because they tend to outperform generic C/C++ code in those circumstances.

4. After verification you can turn off certain protection features offered by the MMU and thus achieve higher total throughput on memory operations.

5. You are not required to protect processes from each other as you can guarantee that they will not interfere with each other.

Face it: the future is managed operating systems - no matter which language or system they're written in.
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Syscalls versus Call Gates

Post by bewing »

grover wrote: Face it: the future is managed operating systems - no matter which language or system they're written in.
I disagree.

The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.

You want to pre-verify the operation of a piece of software just once, and then as long as the filesystem says that the file has not been modified, you think it's OK to run it? :lol: Good luck. I'll be waiting with a stopwatch to see how often your OS crashes. In a perfect world, a managed OS would be a nice thing. In the real world that we actually live in, however ....
thooot
Member
Posts: 30
Joined: Sun Jun 01, 2008 11:20 am

Re: Syscalls versus Call Gates

Post by thooot »

Where are these benchmarks that show managed code is faster than native? You can probably find some special cases where this is true but I doubt it's true in general. One obvious performance issue with managed code is the automated memory management. Many implementations will have a garbage collector periodically kick in. This can have a high overhead, particularly if they use a "stop the world" implementation.

Also the OS design will have a much bigger impact on the performance (and in particular the scalability) than whether you use native or managed code. The future is multi-core and so scalability is more important than raw path length.
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Syscalls versus Call Gates

Post by Colonel Kernel »

bewing wrote:The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.
Very true, and liable to get worse as the number of cores per CPU increases in the coming years. This is one reason why a hybrid approach might make more sense than pure "managed" or "native".

Now that I mention it, can we stop calling them "managed" and "native"? These terms don't really fit what we're talking about here, which is really software isolated processes versus hardware isolated processes.

@thooot: There are some benchmark results in the Singularity research papers, for what it's worth. Also, GC is a lot better now than it was a decade ago. "Stop the world" is not used very often any more. For example, the GC in Mac OS X 10.5 is a concurrent collector, and Apple is working on making it scale very well on multi-core machines. Similarly, the in-kernel GC in Singularity is fully concurrent and real-time.
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
grover
Posts: 17
Joined: Wed Apr 30, 2008 7:20 am

Re: Syscalls versus Call Gates

Post by grover »

bewing wrote:
grover wrote: Face it: the future is managed operating systems - no matter which language or system they're written in.
I disagree.
That's ok. Let's have a good discussion about it ;)
bewing wrote: The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.
I disagree with this point.

There's no difference in the assumptions between a classical OS and one of the "managed" approaches. In fact, error propagation is actually promoted in managed operating systems by the availability of exceptions at the core runtime and language level. Yes, you can achieve the same thing in C++, or in C with error codes, but it is easy to "forget" to return an error code, or to swallow it somewhere it shouldn't be swallowed. This can't happen with exceptions. The other point I'd like to make here is that with the classical error-code approach, code becomes littered with conditionals: the ideal path of the code can't just run; it needs to check for potential errors all the time, which again makes classic code slower.

The other part of this is: how is the classical operating system approach any better on this point? Is it able to protect itself from hiccups, undetectable bit errors and gamma rays? It has no way to do so either.
bewing wrote: You want to pre-verify the operation of a piece of software just once, and then as long as the filesystem says that the file has not been modified, you think it's OK to run it? :lol: Good luck. I'll be waiting with a stopwatch to see how often your OS crashes. In a perfect world, a managed OS would be a nice thing. In the real world that we actually live in, however ....
Actually, the specifications for both Java and .NET (and, I'm reasonably sure, other systems too) require you to perform these verification stages every time you *load* code. It is an integral step in the loader, before a JIT compiler can take over to compile the intermediate code to native code. So a bit error on disk will most likely give a loader error if one of the following cases occurs:

- invalid instruction
- invalid sequence of instructions
- invalid metadata
- invalid method headers

AFAIK no current operating system provides a load-time verification check on the integrity of a binary file. File systems sometimes provide this, but not all of them do. I'm not sure ELF even provides some sort of CRC or hash value an operating system could use to check integrity. I know PE has a checksum, which AFAIK isn't used by Windows.
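A load-time integrity check of the kind being discussed here is just a checksum pass over the image before it is mapped. Below is a minimal bitwise CRC-32 (the reflected polynomial 0xEDB88320 used by zlib and many filesystems) as a sketch of what a loader might run; the function names and the idea of a header-stored checksum are illustrative, not any particular OS's scheme:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Slow but tiny;
 * a real loader would use a table-driven version. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return ~crc;
}

/* A loader would compare this against a checksum stored in the image
 * header and refuse to map the binary on mismatch. */
static int image_checksum_ok(const uint8_t *image, size_t len,
                             uint32_t expected)
{
    return crc32_update(0, image, len) == expected;
}
```

This catches bit rot on disk at load time; as bewing points out below, it does nothing for corruption that happens after the code is already in memory.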

Hardware isolated processes do not give you any advantage at this point. How would they?
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Syscalls versus Call Gates

Post by bewing »

The point is that, to make a typical managed OS run fast, you have to turn off the hardware isolation, as Colonel Kernel says -- making it a "pure software isolated system". What the hardware isolation does is to stop a snowballing error chain (due to an error that has already occurred) from crashing the rest of your system. If you turn it off, every error that is undetected can easily crash everything. And as I said, code that is sitting in memory for a month, or a year, can still become corrupted while it's just sitting there. Gamma rays are mean things, if you are a transistor. Load time verification is very nice (my custom filesystem has crc32 verification) -- but no matter what verification method you use, errors can still occur afterwards. Hardware isolation is not there just to help you stop software problems -- it's there to stop hardware problems, too.

But if you leave the hardware isolation turned on, then the savings on machine time are pretty minimal, at best.
grover
Posts: 17
Joined: Wed Apr 30, 2008 7:20 am

Re: Syscalls versus Call Gates

Post by grover »

I very much agree with you. There are other reasons for hardware isolation too, but I still very much believe that managed operating systems have a lot more advantages even with hardware isolation turned on.

FYI, at least SharpOS will run with activated hardware isolation.
dr_evil
Member
Posts: 34
Joined: Mon Dec 20, 2004 12:00 am

Re: Syscalls versus Call Gates

Post by dr_evil »

thooot wrote:Where are these benchmarks that show managed code is faster than native? You can probably find some special cases where this is true but I doubt it's true in general.
Check the papers on Singularity.