Adding microkernel interface to RDOS

rdos · Post by **rdos** » Mon Dec 17, 2012 12:21 am

Interesting design: http://i30www.ira.uka.de/~neider/edu/mk ... index.html.

Much of the design could be used as a starting-point for integrating a microkernel-like interface into RDOS.

For instance, the syscall interface from 64-bit applications would need to add some variables (in registers R8-R15) describing parameters so that pointers could be copied (with or without paging) to the lower 4G address space. Something similar to "strings" in L4 could be used.

The interface from a syscall in kernel to a 64-bit device driver could use a similar interface as well. Maybe I'll put 64-bit device-drivers at ring 1 or 2 instead of ring 3, mostly because such devices should be able to access the kernel-only syscalls, while applications should not.

One thing I think is not so good is to always use IPC. Looking at my current interface, there are quite a few server threads that look like they could be replaced by an IPC mechanism. However, on closer inspection, most of the syscalls that interact with server threads need to do things before they interact, and typically also afterwards. Also, some syscalls block waiting for the server thread, while others don't. L4 has optimizations for handing over the timeslice and for doing partial schedules. However, I wonder why this is necessary. Why not let the original thread migrate to the device-driver address space and call the syscall entry point in "userland" itself? That would speed up the call considerably, mostly because the scheduler would not become involved at all. It would also allow multiple threads to execute in the driver at the same time without starting multiple server threads.

Another thing I don't like in L4 is timeouts. Why would anybody want to slow down syscalls with timeouts when they really shouldn't be necessary? I think it is much better to ensure that the server threads and requested functions are guaranteed to exist and work. This can be ensured in a similar manner to today's gate registrations. 64-bit device drivers would simply need to define entry points and pointer parameters on startup, and the syscalls could only be executed when this information is present (that's another advantage of not using IPC).

cxzuk · Post by **cxzuk** » Mon Dec 17, 2012 5:02 am

What is RDOS currently, and what do you want it to become?

Mikey

rdos · Post by **rdos** » Mon Dec 17, 2012 5:09 am

cxzuk wrote:What is RDOS currently, and what do you want it to become?

It uses a monolithic kernel that is protected with segmentation. I aim to add 64-bit support for applications in the first stage, and then also some 64-bit device drivers. It is possible that in a very distant future there will be a clean 64-bit version, but I anticipate that 64-bit code will be using the existing 32-bit kernel and drivers for a long time. The APIs (both userland and kernel) will be adapted to 64-bit, but will still use the same registers and syntax, mainly because I find the interface good and adequate. All APIs will continue to be register based.

zeitue · Post by **zeitue** » Mon Dec 17, 2012 6:59 pm

@rdos the XNU kernel by Apple might be something to look at, because it can run in both 32-bit and 64-bit modes and is also a hybrid kernel.

Mach Kernel Interface Reference Manual
XNU source code

rdos · Post by **rdos** » Tue Dec 18, 2012 9:26 am

On further examination of microkernels I'm starting to believe that the design I want would not classify as a microkernel, but so what?

What I want is to run device drivers in separate address spaces, so they cannot affect each other or the kernel. I'm not interested in the costly IPC mechanism if it is not necessary (which I don't think it is). I'm also not changing the syscall interface I have today, but rather want to extend it with 64-bit device drivers written in C (or C++).

The current syscall interface defines an entry-point number. Currently, there are about 500 entry points for kernel/device-driver level and 500 entry points for applications. It also defines which registers are used to pass pointers, including segment registers. Flat code uses selector 0x20 (kernel) or 0x1BB (applications).

When operating across address-space switches, a new parameter is required, which defines how pointers are passed. Pointers can be of two types: strings or buffers. Strings are null-terminated, and the size must be determined by looking for a trailing 0. Buffers have a register that gives the size. These aspects have similarities to IPC in microkernels. However, I don't want the overhead of message passing and task switching. I just want to copy page table entries to the device-driver address space and adjust the pointers.
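A minimal sketch of the string/buffer distinction, written as plain C that runs in a single address space. The names `ptr_kind`, `param_length` and `pages_spanned` are made up for illustration and are not part of the RDOS API; the 4 KiB page size is an assumption. It only shows how the number of page table entries to copy could be derived for each pointer type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical tags for how a syscall passes a pointer. */
enum ptr_kind { PTR_STRING, PTR_BUFFER };

/* Bytes the callee must be able to reach through the pointer.
 * For a string the size is found by scanning for the trailing 0;
 * for a buffer it comes from the size register. */
static size_t param_length(enum ptr_kind kind, const void *p, size_t size_reg)
{
    if (kind == PTR_STRING)
        return strlen((const char *)p) + 1;  /* include the terminator */
    return size_reg;
}

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = addr / PAGE_SIZE;
    uintptr_t last = (addr + len - 1) / PAGE_SIZE;
    return (size_t)(last - first + 1);
}
```

A 10-byte buffer that starts 6 bytes before a page boundary spans two pages, so two entries would have to be copied even though the buffer is tiny.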

I think the new parameter will be supplied to a new (64-bit) registration function which registers the entry point, CR3 and how pointers are passed. The registration function will fill in a new entry for the function and link it to a default stub that does the pointer conversion and calls into the new address space. It is possible that there will be a number of different stubs with hard-coded pointer conversions, each of which, after pointer conversion, calls a common function that calls the 64-bit handler (with an iretq). This would be a good optimization, since the API only contains a small number of different ways of passing pointers.
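A rough sketch of what such a registration table could look like. Everything here is hypothetical: `register_gate64`, `lookup_gate64`, the `conv_method` values and the table size are invented for illustration, and the real stubs and the iretq transfer are not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_GATES 1024  /* assumption: roughly the size of the current API */

/* Hypothetical identifiers for the hard-coded pointer conversions. */
enum conv_method { CONV_NONE, CONV_ONE_STRING, CONV_ONE_BUFFER, CONV_COUNT };

struct gate_entry {
    uint64_t entry_point;   /* 64-bit handler address in the driver */
    uint64_t cr3;           /* address space the handler lives in */
    enum conv_method conv;  /* which stub converts the pointers */
    int present;            /* set once the driver has registered */
};

static struct gate_entry gate_table[MAX_GATES];

/* Register a handler; returns 0 on success, -1 on a bad or duplicate gate. */
int register_gate64(unsigned gate, uint64_t entry, uint64_t cr3,
                    enum conv_method conv)
{
    if (gate >= MAX_GATES || (unsigned)conv >= CONV_COUNT)
        return -1;
    if (gate_table[gate].present)
        return -1;
    gate_table[gate] = (struct gate_entry){ entry, cr3, conv, 1 };
    return 0;
}

/* A syscall can only be dispatched when this returns non-NULL. */
const struct gate_entry *lookup_gate64(unsigned gate)
{
    if (gate >= MAX_GATES || !gate_table[gate].present)
        return NULL;
    return &gate_table[gate];
}
```

Because a lookup fails until the driver has registered, the caller never needs a timeout to find out that the handler is missing.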

Another problem is to provide a stack for the device-driver. I think this can be done by reserving 1G or something of 64-bit memory, and using it for allocating device-driver application stacks. These could be global. Since the device-driver itself could call other syscalls (which might be implemented in other address spaces), there is a need to provide a chain-function for the stack (much like the chain function required for DPMI and DOS-extenders).
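One way to picture the chain function is as a per-thread linked list of saved contexts, pushed when a call enters a driver address space and popped when it returns. This is only a sketch under that assumption; `stack_link`, `thread_ctx`, `chain_enter` and `chain_leave` are hypothetical names, and the actual stack and CR3 switching is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* One saved caller context per nested cross-address-space call. */
struct stack_link {
    uint64_t caller_rsp;      /* stack pointer to restore on return */
    uint64_t caller_cr3;      /* address space to restore on return */
    struct stack_link *prev;  /* previous link in the chain */
};

struct thread_ctx {
    struct stack_link *chain; /* innermost active call, or NULL */
};

/* Record the caller's context before switching to a driver stack. */
void chain_enter(struct thread_ctx *t, struct stack_link *link,
                 uint64_t rsp, uint64_t cr3)
{
    link->caller_rsp = rsp;
    link->caller_cr3 = cr3;
    link->prev = t->chain;
    t->chain = link;
}

/* Pop the innermost context; returns -1 if no call is active. */
int chain_leave(struct thread_ctx *t, uint64_t *rsp, uint64_t *cr3)
{
    struct stack_link *link = t->chain;
    if (link == NULL)
        return -1;
    *rsp = link->caller_rsp;
    *cr3 = link->caller_cr3;
    t->chain = link->prev;
    return 0;
}
```

Nested calls unwind in last-in, first-out order, which is what a driver calling a syscall implemented in another driver would need.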

The 64-bit application interface actually could use a very similar method. For every (supported) syscall, there also needs to be a specification of how pointers are passed. Then the syscall code would simply call the pointer conversion code through a table, similar to how the 64-bit device drivers could be called with different stubs. In fact, the conversion methods could be a global constant shared between these two uses. That would eliminate the need to load registers with conversion methods in 64-bit user space, something that could be manipulated.
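The table-driven conversion could be sketched roughly as below. The register layout, the `conv_method` values and the placeholder stubs are assumptions made for illustration; real stubs would map pages rather than just check for non-zero registers. The point is that the per-syscall method lives in a constant kernel table instead of in a register that user code loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the registers a syscall receives. */
struct regs { uint64_t rsi, rdi, rcx; };

enum conv_method { CONV_NONE, CONV_STRING_IN, CONV_BUFFER_OUT, CONV_COUNT };

typedef int (*conv_fn)(struct regs *r);

/* Placeholder stubs: a real stub would copy page table entries
 * and adjust the pointer; these only validate the registers. */
static int conv_none(struct regs *r) { (void)r; return 0; }
static int conv_string_in(struct regs *r) { return r->rsi ? 0 : -1; }
static int conv_buffer_out(struct regs *r)
{
    return (r->rdi && r->rcx) ? 0 : -1;
}

static const conv_fn conv_stubs[CONV_COUNT] = {
    conv_none, conv_string_in, conv_buffer_out
};

/* Kernel-owned, read-only: conversion method for each syscall number. */
static const uint8_t syscall_conv[] = {
    CONV_NONE, CONV_STRING_IN, CONV_BUFFER_OUT, CONV_STRING_IN
};

/* Convert the pointer parameters of syscall nr; -1 on any error. */
int convert_params(unsigned nr, struct regs *r)
{
    if (nr >= sizeof syscall_conv / sizeof syscall_conv[0])
        return -1;
    return conv_stubs[syscall_conv[nr]](r);
}
```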

Additionally, the syscall interface could optimize performance when the server is in a 64-bit device driver by directly invoking the device-driver without going through legacy mode kernel code.

In fact, I think the net result would be a much faster system than any micro-kernel, even if it is a mixed 32/64 bit system.

Owen · Post by **Owen** » Tue Dec 18, 2012 11:32 am

What you've defined is, essentially, a slow IPC mechanism. If address space switches are involved, it is IPC.

You're essentially designing a mechanism identical to L4's synchronous IPC, except L4 IPC only has 3 formats:

Register, in which N processor registers plus M "Virtual" registers are passed. The virtual registers are generally anchored in some per-thread structure that is easily located; on x86, this structure can normally be found at GS:0 (and GS:0 or GS:-sizeof(void*) is normally the flat address of the structure)
String, in which a buffer of defined size is copied by the kernel
Page, in which a number of pages are transferred

Most L4 implementations come with an IDL compiler, to avoid people having to implement their IPC stubs by hand.

bluemoon · Post by **bluemoon** » Tue Dec 18, 2012 12:20 pm

If you are not desperate to do a micro-kernel, why not just map the drivers into kernel space? On a 64-bit system there is plenty of room, and there is usually a limited number of drivers; I would say an address zone of 256G is generally enough. This may eliminate the need for IPC for this purpose. Doing it fast is not fast, doing nothing is fast.

rdos wrote:What I want is to run device drivers in separate address spaces, so they cannot affect each others or the kernel.

It's debatable whether anything can protect against a ring0 driver going wrong - you may protect it against corrupting other processes, but once such a failure is caught you can usually do nothing but reboot; furthermore, there are still many other things that can go badly wrong which are not protected by address space isolation.
A ring1 driver is already slow due to the ring switch - you should not look for speed in this direction.

rdos · Post by **rdos** » Tue Dec 18, 2012 4:30 pm

bluemoon wrote:If you are not desperate to do a micro-kernel, why not just map the drivers into kernel space? On a 64-bit system there is plenty of room, and there is usually a limited number of drivers; I would say an address zone of 256G is generally enough. This may eliminate the need for IPC for this purpose. Doing it fast is not fast, doing nothing is fast.

Randomly spreading the device-driver code all over the 48-bit address space is an alternative. After all, the top 16 bits of the 48-bit address could be used the same way a 16-bit selector + 32-bit offset can be combined into a 48-bit address. It could provide similar protection. The main problem is that then the 0-selector (IOW the first 4G) should be invalid. That's possible to do for an application (let the first 512G be kernel only), but it is not as easy for a device-driver.
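The selector analogy can be written out as bit arithmetic. This is a sketch only: `va_selector`, `va_offset` and `va_make` are hypothetical helpers, and the sign extension into bits 48-63 is there because x86-64 requires canonical addresses when bit 47 is set.

```c
#include <stdint.h>

/* Bits 32-47 of a 48-bit virtual address, used like a 16-bit selector. */
static inline uint16_t va_selector(uint64_t va)
{
    return (uint16_t)((va >> 32) & 0xFFFF);
}

/* Low 32 bits, used like an offset within the "segment". */
static inline uint32_t va_offset(uint64_t va)
{
    return (uint32_t)va;
}

/* Combine selector and offset into a canonical 64-bit address. */
static inline uint64_t va_make(uint16_t sel, uint32_t off)
{
    uint64_t va = ((uint64_t)sel << 32) | off;
    if (va & (1ULL << 47))           /* upper half of the 48-bit space */
        va |= 0xFFFF000000000000ULL; /* sign-extend to stay canonical */
    return va;
}

/* The "0-selector" is the first 4G, which would have to be invalid. */
static inline int va_is_null_selector(uint64_t va)
{
    return va_selector(va) == 0;
}
```

Each "selector" value then names a 4G window, and a stray 32-bit pointer always lands in window 0.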

bluemoon wrote:It's debatable whether anything can protect against a ring0 driver going wrong - you may protect it against corrupting other processes, but once such a failure is caught you can usually do nothing but reboot;

You could log the error (or in my case, stop the thread so you can inspect it in kernel-debugger).

bluemoon wrote:furthermore, there are still many other things that can go badly wrong which are not protected by address space isolation.
A ring1 driver is already slow due to the ring switch - you should not look for speed in this direction.

Maybe if PML4 entries are substituted there is no need for an address-space switch? But is it possible to flush an entire 512G range? Wouldn't that take just as long as a CR3 reload?
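For reference, the arithmetic behind the 512G figure, assuming standard 4-level paging with 4 KiB pages. The helper names are made up for illustration, and the sketch does not answer the timing question; it only shows how many pages sit under one PML4 entry.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                   /* 4 KiB pages */
#define PML4_SHIFT 39                   /* 12 + 3 * 9 bits below the PML4 index */
#define PML4_SPAN (1ULL << PML4_SHIFT)  /* bytes covered by one entry: 512 GiB */

/* Index (0-511) of the PML4 entry that translates a virtual address. */
static inline unsigned pml4_index(uint64_t va)
{
    return (unsigned)((va >> PML4_SHIFT) & 0x1FF);
}

/* Number of 4 KiB pages under a single PML4 entry (2^27). */
static inline uint64_t pages_per_pml4_entry(void)
{
    return PML4_SPAN >> PAGE_SHIFT;
}
```

With 2^27 pages per entry, invalidating such a range one page at a time would mean up to about 134 million invalidations.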

rdos · Post by **rdos** » Tue Dec 18, 2012 4:43 pm

Owen wrote:Register, in which N processor registers plus M "Virtual" registers are passed. The virtual registers are generally anchored in some per-thread structure that is easily located; on x86, this structure can normally be found at GS:0 (and GS:0 or GS:-sizeof(void*) is normally the flat address of the structure)

In my design, only 32-bit x86 registers could be used for parameter passing. Since x86-64 has all these registers + 64-bit versions of them and some other registers, there is never any need for virtual registers.

Owen wrote: Most L4 implementations come with an IDL compiler, to avoid people having to implement their IPC stubs by hand.

I don't think there will be many stubs. A guess is that less than 10 stubs will cover the entire API with close to 1,000 functions.

zeitue · Post by **zeitue** » Tue Dec 18, 2012 10:36 pm

@rdos perhaps a bit of a boxing idea for your drivers?
As in, you could have drivers run in kernel mode or user mode depending on the driver's stability and speed requirements.
This is just an idea and it might not work at all, I don't know. But if you have a clean, well-defined driver API that allows drivers to run in kernel mode, you could then have an emulation layer (or glue code, whatever you wish to call it) that works like a user-space kernel and exports the kernel resources to the driver. This would allow the driver to be loaded either in kernel mode or into a user-space driver box.

RUMP is something similar, but it works with any kernel component.

rdos · Post by **rdos** » Wed Dec 19, 2012 2:15 am

zeitue wrote:@rdos perhaps a bit of a boxing idea for your drivers?
As in, you could have drivers run in kernel mode or user mode depending on the driver's stability and speed requirements.

They have two options:
1. Run in kernel-mode with a segmented memory model (OpenWatcom 32-bit compact memory model)
2. Run in user-mode with a flat memory model (GCC long mode)

I don't trust flat memory model code for the kernel. That's also why I'm looking at microkernels.

The primary candidate for user mode drivers are file systems. Today they are mixed-bitness, with some segmentation and some flat code. They would be more maintainable in C, and because file system objects cannot be mapped to distinct selectors, the best option is a flat memory model.

OSDev.org
