jal wrote:Jeko wrote:I think managed code MUST be slower than machine code. It's normal.
No, not necessarily. Managed code is checked once, to verify that it doesn't do anything dangerous. After that, it can be run at full speed.
JAL
In fact, machine code isn't checked at all...
Syscalls versus Call Gates
Re: Syscalls versus Call Gates
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling
http://sourceforge.net/projects/jeko - Jeko Operating System
Re: Syscalls versus Call Gates
Jeko wrote:In fact, machine code isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...
JAL
Re: Syscalls versus Call Gates
Brendan wrote:Given the choice between software interrupts, call gates, SYSENTER and SYSCALL, why not just implement all of them?
IIRC, Windows (NT at least, not sure about the DOS shell crap OSes) does just this. There's a page mapped into each process that contains a table of the relevant methods for syscalls. The processes simply jump to the proper offset in that page and, depending on the CPU, the table will contain SYSCALL, SYSENTER, or CALL FAR instructions. Kind of reminds me of the Kernal (sic) jump table on 8-bit Commodore machines. =p
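To make that concrete, here's a rough sketch of what such a shared stub page could contain (this isn't Windows' actual code; the labels and the interrupt vector are invented). The kernel picks one stub at boot based on CPU detection, and user code always calls through the shared page:
Code: Select all
;User code does something like:
;    mov eax, functionNumber
;    call [sharedSyscallStubPointer]
;where the kernel pointed sharedSyscallStubPointer at one of these at boot:

stub_syscall:
    syscall                 ;legacy SYSCALL saves the return EIP in ECX for SYSRET
    ret

stub_sysenter:
    mov ecx,esp             ;SYSEXIT restores ESP from ECX...
    mov edx,.ret            ;...and EIP from EDX
    sysenter
.ret:
    ret

stub_int:
    int 0x60                ;fallback software interrupt (vector is an example)
    ret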
Re: Syscalls versus Call Gates
jal wrote:Jeko wrote:In fact, machine code isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...
JAL
Sorry... But:
managed code is checked, and only after that can it run;
native code runs without any checking.
So surely native code is faster than managed code.
Where am I wrong?
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling
http://sourceforge.net/projects/jeko - Jeko Operating System
Re: Syscalls versus Call Gates
Hi,
Jeko wrote:Brendan wrote:The "INT3" instruction is only one byte (which makes it the smallest possible option) but it's probably better to use it for debugging purposes.
I know that INT 3 is the Breakpoint Exception... but I can use it however I want.
Yes.
Jnc100 is right - INT3 is intended for debuggers to use, so that any instruction can be easily replaced with a breakpoint.
However, if you don't care if debuggers are able to do that (or if you don't want debuggers to be able to do that) then using INT3 can make sense. For example, for my OS's boot code I normally do use INT3 for a "boot API". This is because I do my debugging with Bochs and don't want other people to do their own debugging with INT3, because the code is only run once during boot (size matters more) and because I start using this API before I've done CPU detection (working out if SYSCALL/SYSENTER is supported is more annoying than it's worth in this case).
Jeko wrote:However, when an invalid opcode exception occurs, how can I check whether it was a SYSCALL/SYSENTER or a SYSRET/SYSEXIT?
That's easy enough. Let's start from the start...
Regardless of which method/s you use, it's likely you'll have many kernel API functions, and the caller will put the kernel function number into a register before calling the kernel API. Basically the caller does something like:
Code: Select all
    mov eax, functionNumber                          ;eax = kernel API function number
    <int3, int n, call far, syscall, or sysenter>    ;whichever entry method the caller prefers
And inside the kernel you've got a table of function pointers, and you'd probably end up doing something like:
Code: Select all
    cmp eax,highestSupportedFunctionNumber    ;is the function number sane?
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]           ;call the requested API function
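And a quick example of what that table might look like (the function names here are just made-up examples):
Code: Select all
kernelAPItable:
    dd kernelAPI_exit                ;Function 0
    dd kernelAPI_sendMessage         ;Function 1
    dd kernelAPI_getMessage          ;Function 2
highestSupportedFunctionNumber equ ($ - kernelAPItable) / 4 - 1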
Now, if they use SYSCALL or SYSENTER and the CPU doesn't support it then you'll get an invalid opcode exception. The CPU will put the return EIP on the stack, so your invalid opcode exception handler can get the return EIP from the stack and use it to see which opcodes were used to cause the invalid opcode exception. The opcode for SYSCALL is "0x0F 0x05", and the opcode for SYSENTER is "0x0F 0x34", so you'd do something like:
Code: Select all
if( *returnEIP == 0x0F ) {
    if( *(returnEIP + 1) == 0x05 ) {
        /* Emulate SYSCALL */
    } else if( *(returnEIP + 1) == 0x34 ) {
        /* Emulate SYSENTER */
    }
}
Then, to emulate SYSCALL you'd cheat. You're already in CPL=0 because of the exception, and the function number they wanted to use is/was still in EAX, so you can just do:
Code: Select all
    add dword [esp],2           ;return EIP += 2, so IRETD skips the 2-byte SYSCALL opcode
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd
Emulating SYSENTER is a little harder, because the return EIP needs to come from EDX and the return ESP needs to come from ECX. This still isn't too hard though - you just copy these registers into the exception handler's stack:
Code: Select all
    mov [esp], edx              ;Set new return EIP
    mov [esp+12], ecx           ;Set new return ESP
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd
So, as a summary, the entire thing (for software interrupts, call gates, SYSCALL, SYSENTER, emulated SYSCALL and emulated SYSENTER) would end up something like:
Code: Select all
kernelAPI_softwareInterruptHandler:
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd

kernelAPI_callGateHandler:
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    retf
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    retf

kernelAPI_syscallHandler:
    sti
    push edx                    ;save EDX (SYSCALL leaves us on the caller's stack)
    mov edx,esp
    mov esp,<addressOfThisThreadsKernelStack>
    push edx                    ;save the caller's ESP on the kernel stack
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    pop esp
    pop edx
    sysret
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    pop esp
    pop edx
    sysret

kernelAPI_sysenterHandler:
    sti
    mov esp,<addressOfThisThreadsKernelStack>
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    sysexit
.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    sysexit

invalidOpcodeExceptionHandler:
    push edx
    mov edx,[esp+4]             ;edx = return EIP (address of invalid opcode)
    cmp byte [edx],0x0F         ;Could it have been SYSCALL or SYSENTER?
    jne .unknownInstruction     ; no
    cmp byte [edx+1],0x05       ;Was it SYSCALL?
    je .emulateSYSCALL          ; yes
    cmp byte [edx+1],0x34       ;Was it SYSENTER?
    je .emulateSYSENTER         ; yes
.unknownInstruction:
    pop edx
    ;Do "Blue screen of death" or something here!
    jmp $

.emulateSYSCALL:
    pop edx
    add dword [esp],2           ;return EIP += 2, so IRETD returns past the SYSCALL opcode
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.emulateSYSENTER:
    pop edx
    mov [esp], edx              ;Set new return EIP
    mov [esp+12], ecx           ;Set new return ESP
    cmp eax,highestSupportedFunctionNumber
    ja .badFunctionNumberError
    call [kernelAPItable + eax * 4]
    iretd

.badFunctionNumberError:
    mov eax,ERROR_badFunctionNumber
    iretd
Note1: None of this is tested (it's an *example* not production code).
Note2: There's a lot of setup stuff I skipped, like creating the IDT entry for the software interrupt, creating the GDT entry for the call gate, and setting MSRs for SYSENTER (if supported) and SYSCALL (if supported).
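As a rough sketch (not tested either; the selector names and the stack symbol are invented), that MSR setup might look something like:
Code: Select all
;SYSENTER: MSRs 0x174..0x176 hold the kernel's CS selector, ESP and EIP
    mov ecx,0x174                       ;IA32_SYSENTER_CS
    mov eax,KERNEL_CODE_SELECTOR
    xor edx,edx
    wrmsr
    mov ecx,0x175                       ;IA32_SYSENTER_ESP
    mov eax,initialKernelStackTop
    xor edx,edx
    wrmsr
    mov ecx,0x176                       ;IA32_SYSENTER_EIP
    mov eax,kernelAPI_sysenterHandler
    xor edx,edx
    wrmsr

;SYSCALL (legacy mode): STAR holds the target EIP and the CS selectors,
;and EFER.SCE must be set to enable SYSCALL/SYSRET
    mov ecx,0xC0000081                  ;STAR
    mov eax,kernelAPI_syscallHandler    ;bits 31..0 = SYSCALL target EIP
    mov edx,(USER_CODE_SELECTOR << 16) | KERNEL_CODE_SELECTOR
    wrmsr
    mov ecx,0xC0000080                  ;EFER
    rdmsr
    or eax,1                            ;set SCE (SYSCALL enable)
    wrmsr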
After you've got that working, the only thing you need to do is add your kernel API functions to the "kernelAPItable" table. The caller can use any method they like, and the kernel doesn't care which method they use.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
- Colonel Kernel
- Member
- Posts: 1437
- Joined: Tue Oct 17, 2006 6:06 pm
- Location: Vancouver, BC, Canada
Re: Syscalls versus Call Gates
Jeko wrote:managed code is checked, and only after that can it run;
native code runs without any checking.
So surely native code is faster than managed code.
Where am I wrong?
You're oversimplifying. Once running, managed code is machine code. Verification has a cost, yes, but it is a one-time cost. Once an executable has been checked, it need not be re-checked unless it changes. On the other hand, using software isolation (as implemented in Singularity) instead of hardware isolation has several performance benefits:
- No privilege-level switching overhead for calling the kernel (see the sketch after this list).
- No overhead of TLB misses when switching between processes, assuming that paging is not being used.
- No overhead of manipulating page tables, again assuming that paging is not being used.
- The managed-to-native compiler can optimize everything much more aggressively since dynamic loading is prohibited. For example, whole-program optimization and "tree shaking" to reduce the code size can really speed things up.
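To make the first point concrete (the interrupt vector and kernel function name below are invented): with hardware isolation, every kernel call pays for a privilege-level switch, while in a software-isolated system the verifier has already proven the caller safe, so the same call can compile down to an ordinary near call:
Code: Select all
;With hardware isolation, a kernel call needs a privilege-level switch:
    mov eax, functionNumber
    int 0x60                     ;enter ring 0, dispatch, IRETD back out

;With software isolation, no ring switch is needed - the verifier already
;proved the caller can't do anything dangerous:
    call kernelAPI_sendMessage   ;an ordinary near call straight into the kernel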
Top three reasons why my OS project died:
- Too much overtime at work
- Got married
- My brain got stuck in an infinite loop while trying to design the memory manager
Re: Syscalls versus Call Gates
Sorry - I probably caused this confusion by mentioning managed code and JIT-ing in one breath, without mentioning verification for native compilation.
Re: Syscalls versus Call Gates
Jeko wrote:jal wrote:Jeko wrote:In fact, machine code isn't checked at all...
Sigh... you seem to know it better than anyone here, so no point arguing. Retreating to a place without idiots...
JAL
Sorry... But:
managed code is checked, and only after that can it run;
native code runs without any checking.
So surely native code is faster than managed code.
Where am I wrong?
As a developer on several managed operating systems, I'd like to try to give you another viewpoint:
1. Your C kernel (or ASM kernel, for that matter) largely depends on compiler settings or manual optimizations for a specific architecture, i.e. you only benefit from the specific features of a *single* processor revision. You can *not* optimize the code for the newest processor right away, which in effect forces you either to write certain procedures multiple times, each optimized for a certain processor, or to *not* benefit from newer processors at all. The first costs space, the second makes *your* code slow. A JIT, in contrast, knows the processor at hand and can fine-tune the generated code for the platform it runs on; e.g. it doesn't need to pick between int, sysenter/syscall and far calls for system calls ahead of time, because it knows which one is fastest on its processor.
2. Because at compile time you don't know the target machine you'll run on, you can only optimize for the worst case. In most cases this makes highly iterative code very slow, as the compiler usually makes bad choices with respect to loop unrolling.
3. The verification process does cost time, but that time is paid back many times over during the actual runtime of an application. There are lots of benchmarks on the net that show this. Why do you think most time-critical, highly loaded business apps are written in Java or .NET? Because they tend to outperform generic C/C++ code in their circumstances.
4. After verification you can turn off certain protection features offered by the MMU and thus achieve higher total throughput on memory operations.
5. You are not required to protect processes from each other as you can guarantee that they will not interfere with each other.
Face it: the future is managed operating systems, no matter which language or system they're written in.
Re: Syscalls versus Call Gates
grover wrote:Face it: the future is managed operating systems, no matter which language or system they're written in.
I disagree.
The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.
You want to pre-verify the operation of a piece of software just once, and then as long as the filesystem says that the file has not been modified, you think it's OK to run it? Good luck. I'll be waiting with a stopwatch to see how often your OS crashes. In a perfect world, a managed OS would be a nice thing. In the real world that we actually live in, however ....
Re: Syscalls versus Call Gates
Where are these benchmarks that show managed code is faster than native? You can probably find some special cases where this is true but I doubt it's true in general. One obvious performance issue with managed code is the automated memory management. Many implementations will have a garbage collector periodically kick in. This can have a high overhead, particularly if they use a "stop the world" implementation.
Also the OS design will have a much bigger impact on the performance (and in particular the scalability) than whether you use native or managed code. The future is multi-core and so scalability is more important than raw path length.
- Colonel Kernel
- Member
- Posts: 1437
- Joined: Tue Oct 17, 2006 6:06 pm
- Location: Vancouver, BC, Canada
Re: Syscalls versus Call Gates
bewing wrote:The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.
Very true, and liable to get worse as the number of cores per CPU increases in the coming years. This is one reason why a hybrid approach might make more sense than pure "managed" or "native".
Now that I mention it, can we stop calling them "managed" and "native"? These terms don't really fit what we're talking about here, which is really software-isolated processes versus hardware-isolated processes.
@thooot: There are some benchmark results in the Singularity research papers, for what it's worth. Also, GC is a lot better now than it was a decade ago. "Stop the world" is not used very often any more. For example, the GC in Mac OS X 10.5 is a concurrent collector, and Apple is working on making it scale very well on multi-core machines. Similarly, the in-kernel GC in Singularity is fully concurrent and real-time.
Top three reasons why my OS project died:
- Too much overtime at work
- Got married
- My brain got stuck in an infinite loop while trying to design the memory manager
Re: Syscalls versus Call Gates
bewing wrote:grover wrote:Face it: the future is managed operating systems, no matter which language or system they're written in.
I disagree.
That's OK. Let's have a good discussion about it.
bewing wrote:The entire concept of managed operating systems is built around the assumption that the hardware is 100% reliable -- and therefore that complete reliability can be achieved by creating a 100% reliable software layer on top of that hardware.
I've been in the computer business for 25 years, and I know for a fact that the first assumption is dead wrong. Power supplies cause hardware "hiccups". A few bytes here or there on hard disks become corrupted. The memory bus or a memory chip starts throwing undetectable bit errors. Gamma rays cause the CPU to execute the wrong instruction. And honestly, that's just the tip of it.
I disagree with this point.
There's no difference in the assumptions between a classical OS and one of the "managed" approaches. In fact, error propagation is actually better supported in managed operating systems, thanks to the availability of exceptions at the core runtime and language level. Yes, you can achieve the same thing in C++, or in C with error codes, but it is easy to "forget" to return an error code and to swallow it somewhere it shouldn't be swallowed; this can't happen with exceptions. The other point I'd like to make here is that with the classical error-code approach, code becomes littered with conditionals: the ideal path through the code can't just run, it needs to check for potential errors all the time, which again makes classic code slower.
The other part of this is: how is a classic operating system any better on this point? Is it able to protect itself from hiccups, undetectable bit errors and gamma rays? It has no way to do that either.
bewing wrote:You want to pre-verify the operation of a piece of software just once, and then as long as the filesystem says that the file has not been modified, you think it's OK to run it? Good luck. I'll be waiting with a stopwatch to see how often your OS crashes. In a perfect world, a managed OS would be a nice thing. In the real world that we actually live in, however ....
Actually, the specifications for both Java and .NET (and I'm reasonably sure other systems too) require you to do these verification stages every time you *load* code. It is an integral step in the loader, before a JIT compiler can even take over to compile the intermediate code to native code. So a bit error on disk will most likely give a loader error if one of the following cases occurs:
- invalid instruction
- invalid sequence of instructions
- invalid metadata
- invalid method headers
AFAIK no current operating system provides a load-time verification check on the integrity of a binary file. File systems do provide this sometimes, but not all of them. I'm not sure if ELF even provides some sort of CRC or hash value an operating system may use to check the integrity. I know PE has a checksum, which isn't used by Windows AFAIK.
Hardware-isolated processes don't give you any advantage at this point. How would they?
Re: Syscalls versus Call Gates
The point is that, to make a typical managed OS run fast, you have to turn off the hardware isolation, as Colonel Kernel says -- making it a "pure software isolated system". What the hardware isolation does is to stop a snowballing error chain (due to an error that has already occurred) from crashing the rest of your system. If you turn it off, every error that is undetected can easily crash everything. And as I said, code that is sitting in memory for a month, or a year, can still become corrupted while it's just sitting there. Gamma rays are mean things, if you are a transistor. Load time verification is very nice (my custom filesystem has crc32 verification) -- but no matter what verification method you use, errors can still occur afterwards. Hardware isolation is not there just to help you stop software problems -- it's there to stop hardware problems, too.
But if you leave the hardware isolation turned on, then the savings on machine time are pretty minimal, at best.
Re: Syscalls versus Call Gates
I very much agree with you. There are other reasons for hardware isolation too, but I still very much believe that managed operating systems have a lot more advantages even with hardware isolation turned on.
FYI, at least SharpOS will run with activated hardware isolation.
Re: Syscalls versus Call Gates
thooot wrote:Where are these benchmarks that show managed code is faster than native? You can probably find some special cases where this is true, but I doubt it's true in general.
Check the papers on Singularity.