A few general questions about user mode design

AaronMiller · Post by **AaronMiller** » Sat May 23, 2009 3:45 pm

Well, I've searched through these forums and the wiki (and Google, of course), but I couldn't find much information on user mode: the accepted (or at least the most common or efficient) way of entering user mode and of handling system calls.

I've read about SYSENTER and SYSEXIT, and it seems to me that these would be a much better way of handling system calls than an interrupt. I've read somewhere that this is how Windows handles system calls (XP I think, dunno about Vista/7). So I'm just wondering if there's any reason I shouldn't use SYSENTER/SYSEXIT to handle system calls, or if there's a quick alternative to both SYSENTER/SYSEXIT and interrupts. Perhaps I could provide two ways of handling system calls, interrupts and SYSENTER/SYSEXIT (though that'd probably be a bit bloated). So, basically, my question is: should I use SYSENTER/SYSEXIT, interrupts, or something else for system calls?

My other question seems a bit simpler: is it fine to set up entry into user mode by simply setting a task's segments (CS, DS, SS, etc.)? That way, when the IRET instruction executes, the task in question enters user mode automatically, with no separate method needed. What I mean is that it would be implemented at the multitasking level, rather than in its own source files with its own methods.
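To make the IRET idea concrete, here is a minimal sketch in C of the frame a 32-bit scheduler could build for a new user-mode task. The selector values (`USER_CODE_SEL`, `USER_DATA_SEL`) and the helper name `make_user_frame` are assumptions for illustration, not anything from an existing kernel; they depend entirely on how the GDT is laid out. The only architectural facts relied on are that an IRET to a lower privilege level pops EIP, CS, EFLAGS, ESP and SS in that order, and that the low two bits of a selector hold the requested privilege level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical GDT layout: these selector values are assumptions. */
#define USER_CODE_SEL   0x18u
#define USER_DATA_SEL   0x20u
#define RPL_USER        0x3u    /* requested privilege level 3 */
#define EFLAGS_RESERVED 0x2u    /* bit 1 of EFLAGS is always set */
#define EFLAGS_IF       0x200u  /* interrupts enabled in user mode */

/* What IRET pops when returning to a less privileged level
 * (32-bit protected mode), lowest address first. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Fill in a frame so that the task switch code, which ends in IRET,
 * lands in ring 3 at `entry` with its stack at `user_stack_top`. */
static void make_user_frame(struct iret_frame *f,
                            uint32_t entry, uint32_t user_stack_top)
{
    assert(f != NULL);
    f->eip    = entry;
    f->cs     = USER_CODE_SEL | RPL_USER;
    f->eflags = EFLAGS_RESERVED | EFLAGS_IF;
    f->esp    = user_stack_top;
    f->ss     = USER_DATA_SEL | RPL_USER;
}
```

IRET itself only reloads CS and SS, so the switch code would still need to load DS, ES, FS and GS with the user data selector, and the TSS needs a valid SS0/ESP0 so the CPU can find a kernel stack on the next interrupt or system call.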

Thanks.

Cheers,
-naota

bewing · Post by **bewing** » Sat May 23, 2009 5:06 pm

There have not been enough practical, working, OS designs to do a statistical analysis yet of what method of handling syscalls is "best". You get to guess.

Besides SYSENTER, another very common method of implementing syscalls is to send IPC messages to a syscall handler (or more than one) at a known IPC address. Another less common method is to queue usermode syscalls, and then batch process them when the scheduler decides that the time is right.
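As a rough illustration of the queue-and-batch idea, the sketch below uses a small ring buffer that user code fills and the kernel drains in one pass. All names here (`syscall_queue`, `queue_syscall`, `process_batch`, `demo_dispatch`) and the slot count are made up for the example; a real design would also need to decide how the queue memory is shared and how results are signalled back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u  /* assumed size; a power of two */

struct syscall_req {
    uint32_t number;   /* which call */
    uint32_t arg;      /* single argument, for brevity */
    int32_t  result;   /* filled in by the kernel */
};

struct syscall_queue {
    struct syscall_req slots[QUEUE_SLOTS];
    uint32_t head;     /* next request the kernel will process */
    uint32_t tail;     /* next slot user code will fill */
};

static void queue_init(struct syscall_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->tail = 0;
}

/* User side: queue a request. Returns 0 on success, -1 if full. */
static int queue_syscall(struct syscall_queue *q, uint32_t number, uint32_t arg)
{
    if (q->tail - q->head == QUEUE_SLOTS)
        return -1;
    struct syscall_req *r = &q->slots[q->tail % QUEUE_SLOTS];
    r->number = number;
    r->arg = arg;
    r->result = 0;
    q->tail++;
    return 0;
}

/* Kernel side: drain everything queued so far in one batch.
 * Returns the number of requests processed. */
static unsigned process_batch(struct syscall_queue *q,
                              int32_t (*dispatch)(uint32_t, uint32_t))
{
    unsigned done = 0;
    while (q->head != q->tail) {
        struct syscall_req *r = &q->slots[q->head % QUEUE_SLOTS];
        r->result = dispatch(r->number, r->arg);
        q->head++;
        done++;
    }
    return done;
}

/* Stand-in for a real syscall table, so the batch has something to call. */
static int32_t demo_dispatch(uint32_t number, uint32_t arg)
{
    return (int32_t)(number + arg);
}
```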

Typically the SYSENTER method seems to work well, and has very little overhead. The common suggestion is to (as you said) implement multiple syscall methods and let the application designer decide which to use.

Love4Boobies · Post by **Love4Boobies** » Sat May 23, 2009 11:08 pm

The downside to SYSENTER/SYSEXIT is that they're not always supported. You should probably use them when they're supported and fall back to some other mechanism when they're not. IIRC, there was a topic in which Brendan described some differences between AMD's implementation and Intel's with regard to when these instructions are actually supported. You should probably go ahead and check both Intel's and AMD's manuals.

Brendan · Post by **Brendan** » Sun May 24, 2009 12:21 pm

Hi,

Love4Boobies wrote: The downside to SYSENTER/SYSEXIT is that they're not always supported. You should probably use them when they're supported and fall back to some other mechanism when they're not. IIRC, there was a topic in which Brendan described some differences between AMD's implementation and Intel's with regard to when these instructions are actually supported. You should probably go ahead and check both Intel's and AMD's manuals.

IIRC it goes like this...

If an AMD CPU says it supports SYSENTER, then the CPU only supports SYSENTER in protected mode (and not in long mode). If an Intel CPU says it supports SYSENTER then it supports SYSENTER in both protected mode and long mode.

If an Intel CPU says it supports SYSCALL, then the CPU only supports SYSCALL in long mode (and not in protected mode). If an AMD CPU says it supports SYSCALL then it supports SYSCALL in both protected mode and long mode.

Also, some Intel PentiumPro CPUs have a bug where CPUID says that SYSENTER is supported when it's not. To get around that, the "Intel® Processor Identification and the CPUID Instruction" application note suggests:

Intel wrote:

IF (CPUID SEP bit is set)
{
     IF ((Processor Signature & 0x0FFF3FFF) < 0x00000633)
          Fast System Call is NOT supported
     ELSE
          Fast System Call is supported
}
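The quoted check translates fairly directly into C. This is only a sketch: the function name `intel_fast_syscall_supported` is made up, and it assumes the caller has already executed CPUID leaf 1 and passes in EAX (the processor signature) and EDX (the feature flags, where SEP is bit 11).

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EDX_SEP (1u << 11)  /* CPUID.01H:EDX bit 11, SYSENTER/SYSEXIT */

/* signature    = EAX from CPUID leaf 1
 * edx_features = EDX from CPUID leaf 1 */
static bool intel_fast_syscall_supported(uint32_t signature,
                                         uint32_t edx_features)
{
    if (!(edx_features & CPUID_EDX_SEP))
        return false;               /* CPU does not claim support at all */

    /* Mask and threshold taken from the quoted pseudocode: anything
     * below family 6, model 3, stepping 3 reports SEP incorrectly. */
    if ((signature & 0x0FFF3FFFu) < 0x00000633u)
        return false;

    return true;
}
```

For example, a signature of 0x0619 with the SEP bit set is rejected, while 0x0633 with the SEP bit set is accepted.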

In general, all data you get from CPUID (and the way you get that data) should be considered "CPU specific". IMHO a good OS will parse CPUID's information and construct its own (standardized and corrected) data, and good applications will ask the OS for correct information (instead of using incorrect and/or misleading information directly from CPUID). For example, for SYSENTER and SYSCALL I use 4 separate flags in my own set of feature flags ("SYSENTER32", "SYSENTER64", "SYSCALL32" and "SYSCALL64"), so that CPU detection only needs to be done properly once and all of the other code can rely on my own feature flags instead of worrying about whether the CPU is AMD or Intel (or a PentiumPro)...
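A rough sketch of how those four flags might be derived, following the vendor rules described earlier in this post. The struct `raw_cpuid`, the flag constants and `normalize_syscall_features` are hypothetical names, not the actual code from any OS, and the sketch assumes the raw CPUID bits have already been collected. Vendors other than Intel and AMD are treated conservatively, which is a guess rather than something stated above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum {
    FEAT_SYSENTER32 = 1 << 0,
    FEAT_SYSENTER64 = 1 << 1,
    FEAT_SYSCALL32  = 1 << 2,
    FEAT_SYSCALL64  = 1 << 3
};

/* Raw values as reported by CPUID, before any correction. */
struct raw_cpuid {
    char     vendor[13];  /* e.g. "GenuineIntel" or "AuthenticAMD" */
    uint32_t signature;   /* CPUID.01H:EAX */
    bool     sep;         /* CPUID.01H:EDX bit 11 */
    bool     syscall;     /* CPUID.80000001H:EDX bit 11 */
    bool     long_mode;   /* CPUID.80000001H:EDX bit 29 */
};

static uint32_t normalize_syscall_features(const struct raw_cpuid *c)
{
    uint32_t flags = 0;
    bool intel = strcmp(c->vendor, "GenuineIntel") == 0;
    bool amd   = strcmp(c->vendor, "AuthenticAMD") == 0;
    bool sep   = c->sep;

    /* PentiumPro erratum: SEP is reported but not usable. */
    if (intel && sep && (c->signature & 0x0FFF3FFFu) < 0x00000633u)
        sep = false;

    if (sep) {
        flags |= FEAT_SYSENTER32;
        if (intel && c->long_mode)
            flags |= FEAT_SYSENTER64;  /* AMD: protected mode only */
    }
    if (c->syscall) {
        if (c->long_mode)
            flags |= FEAT_SYSCALL64;
        if (amd)
            flags |= FEAT_SYSCALL32;   /* Intel: long mode only */
    }
    return flags;
}
```

The rest of the kernel would then test these flags and never look at the vendor string or the raw CPUID bits again.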

Cheers,

Brendan

xenos · Post by **xenos** » Mon May 25, 2009 1:09 am

If you want to be independent of the supported system call mechanisms, you could use an approach similar to mine:

Applications, drivers, libraries etc. never use a system call instruction directly, no matter whether it is SYSENTER, SYSCALL, an old-style interrupt or whatever. Instead, the kernel provides a "user trampoline", i.e. a piece of code that is mapped into user space. The kernel itself is a shared ELF file, so I can link applications and drivers to the kernel. When they are loaded, they can call kernel functions in the "user trampoline", which finally use a system call instruction to enter kernel mode. The kernel then checks whether the instruction was performed within the trampoline code and only continues if this is true - otherwise it immediately returns to user mode. Using the trampoline is the only way to enter kernel mode. And of course, it is write-protected.
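The "did this come from the trampoline?" check can be as small as a range test on the user-mode return address. The sketch below is not the actual kernel code: `TRAMPOLINE_BASE` is a made-up address, the one-page size matches what is mentioned later in the thread, and how the return address is obtained depends on the mechanism (an interrupt pushes it on the kernel stack, while SYSENTER leaves it to software convention).

```c
#include <stdbool.h>
#include <stdint.h>

#define TRAMPOLINE_BASE 0xFFFF0000u  /* hypothetical mapping address */
#define TRAMPOLINE_SIZE 0x1000u      /* one 4 KiB page */

/* Returns true if the user-mode address the system call will return to
 * lies inside the read-only trampoline page. */
static bool entered_from_trampoline(uint32_t user_return_addr)
{
    return user_return_addr >= TRAMPOLINE_BASE &&
           user_return_addr - TRAMPOLINE_BASE < TRAMPOLINE_SIZE;
}
```

A syscall entry path would call this first and return to user mode straight away when it yields false.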

Now applications and drivers are completely independent of the supported system call mechanisms, since all they have to do is call the user trampoline provided by the kernel. The kernel, however, has to choose the right way to enter kernel mode. You can detect the supported mechanisms at run time and choose one whenever kernel mode needs to be entered. You can also provide different sets of trampoline functions for different mechanisms and choose the right ones when the application is loaded and linked to the kernel. Finally, you can choose the system call mechanism at compile time using some config.h file and compile "optimized" kernels for different CPUs, which include only one system call mechanism.
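The second option (picking a set of trampoline functions once, at load time) could look roughly like the following in C. Everything here is illustrative: the three `enter_via_*` functions are stand-ins for what would really be short assembly stubs executing `int`, `sysenter` or `syscall`, and the preference order in `trampoline_bind` is an assumption, not something prescribed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum syscall_mech { MECH_INT, MECH_SYSENTER, MECH_SYSCALL };

typedef int32_t (*syscall_entry_fn)(uint32_t number, uint32_t arg);

/* Stand-ins for the real entry stubs. Each one just reports which
 * mechanism it represents, so the selection can be observed. */
static int32_t enter_via_int(uint32_t number, uint32_t arg)
{
    (void)number; (void)arg;
    return MECH_INT;
}

static int32_t enter_via_sysenter(uint32_t number, uint32_t arg)
{
    (void)number; (void)arg;
    return MECH_SYSENTER;
}

static int32_t enter_via_syscall(uint32_t number, uint32_t arg)
{
    (void)number; (void)arg;
    return MECH_SYSCALL;
}

/* The one entry point applications are linked against. */
struct trampoline {
    syscall_entry_fn enter;
};

/* Bind the trampoline once, based on what the CPU supports.
 * The interrupt path is the fallback that works everywhere. */
static void trampoline_bind(struct trampoline *t,
                            int has_sysenter, int has_syscall)
{
    assert(t != NULL);
    if (has_sysenter)
        t->enter = enter_via_sysenter;
    else if (has_syscall)
        t->enter = enter_via_syscall;
    else
        t->enter = enter_via_int;
}
```

Applications only ever call `t->enter(...)`, so swapping the mechanism never requires recompiling them.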

Love4Boobies · Post by **Love4Boobies** » Mon May 25, 2009 1:32 am

All OSes provide libraries which do the actual system calls.

NickJohnson · Post by **NickJohnson** » Mon May 25, 2009 5:32 am

Well Xen, doesn't that needlessly take power away from the usermode program? The system calls are in general going to be done by a library anyway, and that poses no security risk regardless of how you do them. It's just that in the more usual design, you are allowed to do system calls directly, and if the "trampoline" library happens to have significant overhead on a call, you cannot avoid that overhead. I would just let the C library do the calls - it's simpler and equally portable. You could let the kernel give some information about system call handling in a special piece of address space so the library would know which type of call to use.

Although, I guess my design shares some of that idea. I have a read only library mapped in all address spaces that does, at least by default, all system calls and signal handling. This is because my real ABI is tricky to use in some cases, and I have to have default handlers for signals (in case of real errors). However, the program may override nearly everything the library does if it thinks it can do it faster/better or use it for something new. My philosophy is that you should give as much low level control to the user program as possible, but never force the added complexity upon it.

xenos · Post by **xenos** » Tue May 26, 2009 12:46 am

I guess that's a matter of philosophy.

The trampoline code in my kernel is really not much overhead. It simply checks the passed arguments for validity and performs the actual system call. Of course it would also be possible to put them into some library, and I think there is not much of a difference between my approach and a (dynamically linked) library.

In contrast, using a statically linked library would break my concept of portability since I would have to recompile all applications once I use a different kernel interface. So far, dynamically linking the application to some dynamic object, which may be the kernel trampoline or some other library, seems to be the most portable solution to me.

Nevertheless, I like your idea of a shared library that may be overridden by applications. This gives the possibility (and no necessity) to create "optimized" (and maybe less portable) applications, while in my approach all of this optimization and non-portability is encapsulated in the kernel, making applications highly portable.

NickJohnson · Post by **NickJohnson** » Wed May 27, 2009 5:29 pm

Well, I wasn't trying to say that the library would have much overhead, obviously not any more than a normal library, but that you can't avoid that overhead, big or small. Although, I was more wondering why you had the library static in each address space - I guess it could save a significant amount of memory, which would be a good reason.

I also guess I have a bit of a different view of portability than you. I'm a Gentoo user, so I'm source oriented instead of binary oriented. As long as the code will compile for any given installation, I consider it portable. Regardless, I don't think either of our approaches would break binary compatibility at any point.

xenos · Post by **xenos** » Thu May 28, 2009 4:19 am

NickJohnson wrote: Well, I wasn't trying to say that the library would have much overhead, obviously not any more than a normal library, but that you can't avoid that overhead, big or small. Although, I was more wondering why you had the library static in each address space - I guess it could save a significant amount of memory, which would be a good reason.

The whole trampoline fits into a single 4kB page, so mapping it into every address space does not waste too much memory. Of course, mapping larger libraries into every address space would waste a huge amount of memory, which is certainly not what I want.

NickJohnson wrote: I also guess I have a bit of a different view of portability than you. I'm a Gentoo user, so I'm source oriented instead of binary oriented. As long as the code will compile for any given installation, I consider it portable. Regardless, I don't think either of our approaches would break binary compatibility at any point.

Of course, that's a matter of taste.

From my point of view, the OS is some kind of abstraction layer that spares application developers from having to deal with different hardware implementations. In your approach, as far as I can see, this abstraction takes place in the C library, which encapsulates the system calls, so both approaches allow portable application binaries. In addition, your approach gives application developers the option of actually dealing with different hardware implementations.

So, if I see this correctly, if at some point you decide to change the ABI of your kernel, and some application uses this ABI directly, bypassing the C library, the application needs to be adapted to the new ABI? This is the only (small) disadvantage I can see, at least for my OS, which is in an experimental stage and the ABI might unpredictably change.

NickJohnson · Post by **NickJohnson** » Thu May 28, 2009 4:41 am

Well, my ABI is extremely compact, so it should be reasonably stable. I have 8 total system calls, most of which only take a couple of arguments. However, there are also regions of memory that control things, like the signal handler table, that are in a fixed position. The issue is that everything is very low level, which is why I provide the library to manage them.

And I was actually thinking that mapping a library in all address spaces would save memory, because you could just have the kernel keep those frames mapped across process creation - that's what I'm doing with my setup. So the more complex the library is, the better, because then you save that space in the libc and mapping it once is a negligible cost.

xenos · Post by **xenos** » Thu May 28, 2009 6:45 am

Sorry, I got that last point wrong in your previous post. Of course it would be a waste of space to statically link the C library or any kernel libraries to executables as they can more efficiently be mapped into every address space. In fact, this is exactly what my OS does. There is some high level library which plays the role of the C library and is mapped into every address space. It contains the user level functions, like user heap management, as well as high level functions for things like file system access, which is implemented by sending an IPC to the corresponding drivers. The kernel functions, however, are in the user trampoline of the kernel binary - for example, functions for IPC, thread creation and destruction...

Of course, if your ABI is stable, there is no reason for not using it in applications.

Troy Martin · Post by **Troy Martin** » Thu May 28, 2009 8:47 am

There's always the hangover method of a syscall interrupt or two. Easy to configure and supported everywhere.

$0.02

OSDev.org
