Designs of microkernels

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
nexos
Member
Posts: 1079
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Designs of microkernels

Post by nexos »

Hello,
I have been doing some research on microkernels to figure out how I should develop mine. I looked at some material on Mach, and it looked very powerful and bloated. I haven't gotten to Minix yet. Over the last couple of days, I read a couple of papers on L4 and L3, and they seem to say that Mach's poor performance was due to its buffered asynchronous IPC. L4, by contrast, delivered a message by context switching directly to the receiving thread. And L4's performance, even the C++ version's, was still remarkable! What kind of design do you use in your microkernel? At this point, I'm leaning towards a more L4-like approach.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
User avatar
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: Designs of microkernels

Post by bzt »

nexos wrote:What kind of design do you use in your microkernel? At this point, I'm leaning towards a more L4-like approach.
I use my own solution, but it's similar to the L4 way (for small messages I use registers, CoW mappings for medium buffers, and shared mappings for large buffers). I also switch to the message receiver's task right away, which works pretty well for low latency.
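A size-based dispatch like the one described above can be sketched in C. This is only an illustration of the idea; the thresholds, names, and enum are invented for the sketch, not bzt's actual code:

```c
#include <stddef.h>

/* Hypothetical three-tier send path. Thresholds are illustrative:
 * a register-sized payload, one page for CoW, everything else shared. */
#define MSG_REG_MAX 64      /* small enough to pass in registers */
#define MSG_COW_MAX 4096    /* up to a page: map copy-on-write */

enum ipc_path { IPC_REGISTERS, IPC_COW_MAP, IPC_SHARED_MAP };

/* Pick the transfer mechanism from the payload size alone. */
enum ipc_path ipc_choose_path(size_t len)
{
    if (len <= MSG_REG_MAX)
        return IPC_REGISTERS;   /* copied via registers, no memory touched */
    if (len <= MSG_COW_MAX)
        return IPC_COW_MAP;     /* sender's pages mapped copy-on-write */
    return IPC_SHARED_MAP;      /* large buffer: establish a shared mapping */
}
```

The send path would then combine this with the direct switch to the receiver that bzt mentions, so small messages never touch memory at all.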

Cheers,
bzt
nexos
Member
Posts: 1079
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Designs of microkernels

Post by nexos »

Yeah, that's what I plan on doing. I wonder why L4 is rarely used while slow old Mach gets all the credit and attention... A few months ago, before I looked at L4, I thought about such a scheme, but thought it would be impossible to implement. Looks like it isn't!
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
thewrongchristian
Member
Posts: 425
Joined: Tue Apr 03, 2018 2:44 am

Re: Designs of microkernels

Post by thewrongchristian »

nexos wrote:Yeah, that's what I plan on doing. I wonder why L4 is rarely used while slow old Mach gets all the credit and attention... A few months ago, before I looked at L4, I thought about such a scheme, but thought it would be impossible to implement. Looks like it isn't!
L4, in the form of OKL4, has been deployed in the billions of units, embedded into cell radio chipsets.

Mach isn't that well deployed, especially not the microkernel version 3.0. XNU (iOS, macOS) is based on Mach 2.5, which is monolithic.
nexos
Member
Posts: 1079
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Designs of microkernels

Post by nexos »

I have been doing research, and I decided to still make a microkernel, no change there, but to put drivers and such in kernel space to keep performance up, so I will have a hybrid system. Microkernel advocates will probably say that this decreases stability and security. After much research, I actually think that microkernels aren't as stable and secure as people believe. Before I get started explaining, no flaming please :) .

Microkernels are said to have three big advantages: modularity, stability, and security. Monolithic kernels are said to be fast, and hybrid kernels are said to be fast and modular. All these claims are valid. Microkernels and hybrid kernels are much more modular, which is why I am still basing my OS around a microkernel. Monolithic and hybrid kernels are faster than microkernels, unless you are an expert and can get close enough, like L4Ka::Hazelnut managed to even though it was written in C++.

Microkernels' stability claim is, in theory, very valid: if a service crashes, the system keeps running. That may not matter much for some servers, but if a server like the filesystem or memory manager crashes, the system will probably grind to a halt anyway. Of course, there could be some way to restart services, but a crashing server may just keep crashing if the failure uncovered a real bug. It might be best to bring the whole system down when this happens. That's an inconvenience, of course, but I would rather deal with a system that crashes once in a blue moon than one that is always slow.

Microkernels' security claim is, in theory, also valid. But if a hybrid-kernel system only allows modules to be loaded by the root user, that makes it harder to attack. Of course, if the root account were hijacked we would have a problem, but that would be bad enough on its own. Also, a driver is always going to need some sort of low-level access; a prime example is the driver manager. You could run privileged, trusted servers under a trustedsystem account and third-party servers under a plain system account, but if that trustedsystem account is hijacked, we are back to square one.

Overall, that is my take on the whole Torvalds vs. Tanenbaum debate. I side with neither. I'd say Windows NT could have come the closest to getting it right, had it not been for extreme bloat. I may research Plan 9 quite a bit, as it seems to have gotten a lot right.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
OSwhatever
Member
Posts: 595
Joined: Mon Jul 05, 2010 4:15 pm

Re: Designs of microkernels

Post by OSwhatever »

nexos wrote:I looked at some stuff for Mach, and it looked very powerful and bloated.
I went a step further: I made an IPC mechanism that is really slow but very versatile, though the base design is certainly simpler than Mach's. That forces you to think about how to design the interfaces. For example, reading directory entries (as with readdir) can fetch several entries in advance, reducing the number of IPC calls. You should assume that IPC calls are slow, as if the other end were another computer reached over a network; that mindset helps you design the interfaces.
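The batching idea can be illustrated with a hypothetical reply structure for a readdir-style call: one IPC round trip carries many entries, amortizing the (assumed slow) message cost. All names and sizes here are invented for the sketch:

```c
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 60
#define ENTRIES_PER_REPLY 32

struct dir_entry {
    unsigned inode;
    char name[NAME_MAX_LEN];
};

/* One reply carries up to ENTRIES_PER_REPLY entries plus a resume cookie,
 * so the client calls back only when this batch is exhausted. */
struct readdir_reply {
    unsigned count;                          /* entries actually filled */
    unsigned next_offset;                    /* cookie for the next request */
    struct dir_entry entries[ENTRIES_PER_REPLY];
};

/* Server side: pack as many entries as fit into a single reply. */
unsigned fill_reply(struct readdir_reply *r, const char **names, unsigned n,
                    unsigned offset)
{
    r->count = 0;
    while (offset + r->count < n && r->count < ENTRIES_PER_REPLY) {
        struct dir_entry *e = &r->entries[r->count];
        e->inode = offset + r->count + 1;    /* stub inode number */
        strncpy(e->name, names[offset + r->count], NAME_MAX_LEN - 1);
        e->name[NAME_MAX_LEN - 1] = '\0';
        r->count++;
    }
    r->next_offset = offset + r->count;
    return r->count;
}
```

A directory of a few hundred entries then costs a handful of round trips instead of one per entry.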
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

OSwhatever wrote:I went a step further: I made an IPC mechanism that is really slow but very versatile, though the base design is certainly simpler than Mach's. That forces you to think about how to design the interfaces. For example, reading directory entries (as with readdir) can fetch several entries in advance, reducing the number of IPC calls. You should assume that IPC calls are slow, as if the other end were another computer reached over a network; that mindset helps you design the interfaces.
To the contrary. You should make IPC fast enough that it makes as much sense to put things in separate processes as in the same process. You can't get out of needing IPC, especially in a microkernel. There's been some other discussion recently of single address-space OSes, which provide one solution. The Mill CPU (if it ever comes out) provides another that also lets you take advantage of hardware protection. You can also use strategies like io_uring to increase the throughput of system calls, shifting essential complexity into kernel space (where it belongs) while still giving userspace a lot of flexibility.
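The io_uring-style strategy boils down to queueing many requests in memory shared with the kernel and crossing the user/kernel boundary once. This toy ring captures only that idea; it is not the real io_uring/liburing API, and every name in it is invented:

```c
#include <stdint.h>

#define RING_SLOTS 8

/* A submission entry: opcode plus one argument, for illustration. */
struct sqe { uint32_t opcode; uint64_t arg; };

/* Userspace produces at tail; the "kernel" consumes at head.
 * head and tail are free-running counters, taken mod RING_SLOTS. */
struct ring {
    struct sqe slots[RING_SLOTS];
    uint32_t head;
    uint32_t tail;
};

/* Queue a request without any syscall. Returns -1 if the ring is full. */
int ring_push(struct ring *r, uint32_t op, uint64_t arg)
{
    if (r->tail - r->head == RING_SLOTS)
        return -1;
    r->slots[r->tail % RING_SLOTS] = (struct sqe){ op, arg };
    r->tail++;
    return 0;
}

/* One boundary crossing drains everything queued so far;
 * here the "kernel" just consumes the entries and reports how many. */
uint32_t ring_submit_all(struct ring *r)
{
    uint32_t n = r->tail - r->head;
    r->head = r->tail;
    return n;
}
```

The point is the ratio: N requests, one trap. The real interface adds a completion ring and memory barriers, omitted here for brevity.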
User avatar
eekee
Member
Posts: 881
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Designs of microkernels

Post by eekee »

Re. the last 2 posts: I don't think latency has kept pace with throughput anywhere in modern computing, inter-x-communication is always slower than ideal. Making IPC fast is lovely if you can do it, but you will need to batch requests. I may be overgeneralizing, but I don't think IPC can break this mold. Not without lowering security, anyway.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
User avatar
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Designs of microkernels

Post by Schol-R-LEA »

eekee wrote:Re. the last 2 posts: I don't think latency has kept pace with throughput anywhere in modern computing, inter-x-communication is always slower than ideal. Making IPC fast is lovely if you can do it, but you will need to batch requests. I may be overgeneralizing, but I don't think IPC can break this mold. Not without lowering security, anyway.
Synthesis (which was sort of a hybrid kernel, but an unusual one) reportedly improved IPC dramatically, but did so with a very unusual kernel design combined with batching and 'folding' (pre-computing) both system calls and serial IPC messages. While it showed an improvement on two specific platforms for both IPC and system services compared to contemporary kernel designs, AFAICT no one has tested whether it would show the same improvement on present-day stock hardware.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

eekee wrote:I may be overgeneralizing, but I don't think IPC can break this mold. Not without lowering security, anyway.
A SASOS can be made secure and fast with a formally verified JIT. Here's a paper describing one. Another thing to look at in the space of formally verified compilation is LLVM's Alive2, though that's probably less relevant.

If you have an SASOS, then 'IPC' can be a direct function call.
User avatar
eekee
Member
Posts: 881
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Designs of microkernels

Post by eekee »

Schol-R-LEA wrote:Synthesis (which was sort of a hybrid kernel, but an unusual one) reportedly improved IPC dramatically, but did so with a very unusual kernel design combined with batching and 'folding' (pre-computing) both system calls and serial IPC messages. While it showed an improvement on two specific platforms for both IPC and system services compared to contemporary kernel designs, AFAICT no one has tested whether it would show the same improvement on present-day stock hardware.
Ooh! I've added Synthesis to my list of things to study. The concept of pre-computing always puzzles me though. :) Is it like caching generated code?


@moonchild: That's basically what we were saying: with a safe language, you don't need an MMU, and without an MMU, IPC can be a function call. Thanks for the link on provably safe language technology.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
User avatar
AndrewAPrice
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Designs of microkernels

Post by AndrewAPrice »

My original design for IPCs was to try to do lightweight synchronous RPCs. The theory was:
- Register an entry point for each RPC your service can handle.
- The caller would invoke a syscall that behaves like a function call, except that the "call/return" crosses address spaces.

This had many challenges:
- We'd need to create a stack in the callee every time a call is made. How big should this stack be? Maybe this could be sped up with a 'stack pool' to quickly reuse stacks?
- There's language-dependent boilerplate involved in creating and destroying a thread (dealing with thread local storage, for example.) The kernel can't just start executing arbitrary C++ in a new thread.

These are solvable challenges (I'd have to register not just the handler, but the thread entrypoint that takes the handler as a parameter, and we could specify a stack size during registration.)
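A registration record along those lines might look like this in C. Every name, the table layout, and the limits are hypothetical; this just sketches the "register entry point + handler + stack size" scheme described above:

```c
#include <stddef.h>
#include <stdint.h>

/* The handler is the actual service routine; the entry point is the
 * language-runtime shim (TLS setup etc.) that the kernel jumps to,
 * so the kernel never starts executing arbitrary C++ directly. */
typedef void (*rpc_handler_t)(void *msg);
typedef void (*thread_entry_t)(rpc_handler_t handler, void *msg);

struct rpc_endpoint {
    uint32_t       rpc_id;       /* which RPC this record serves */
    thread_entry_t entry;        /* runtime shim: sets up, calls handler */
    rpc_handler_t  handler;      /* the service routine proper */
    size_t         stack_size;   /* callee stack to allocate per call */
};

/* A trivial global endpoint table; a real kernel would keep one
 * per process and validate the caller's rights. */
#define MAX_ENDPOINTS 16
static struct rpc_endpoint table[MAX_ENDPOINTS];
static unsigned table_len;

int rpc_register(uint32_t id, thread_entry_t entry, rpc_handler_t h,
                 size_t stack_size)
{
    if (table_len == MAX_ENDPOINTS)
        return -1;
    table[table_len++] = (struct rpc_endpoint){ id, entry, h, stack_size };
    return 0;
}
```

On an incoming call the kernel would then allocate (or reuse, via the stack pool) a `stack_size` stack and start the callee thread at `entry`, passing it `handler`.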

This approach would give us:
- Synchronous calling (the RPC functions like a function call, and won't return until the RPC returns.)
- Asynchronous handling (the RPC spawns a new thread in the callee's process, to give the illusion that it's a continuation of the caller's thread.)

So while it would feel natural (RPC that are function calls, services are just shared libraries that run in their own address space), when I think about actual use cases, I'd probably want the opposite:
- Asynchronous calling
- Synchronous handling

Implementing asynchronous calling on top of a synchronous calling API is inefficient (I'd have to spawn a thread just for it to call one blocking function), while the opposite isn't true (it's cheap to block until a message is returned).

Implementing synchronous handling on top of an asynchronous handling API is inefficient (the kernel would have spawned a thread just for us to block on a mutex), while the opposite costs the same (while synchronously processing the incoming message queue, we can spawn a thread to do the actual work, which is just as heavyweight as the kernel spawning the handling thread in an asynchronous handling API).

I'm not concerned about the small amount of copying for sending messages and polling the queue (5 × 64-bit parameters + message id + destination/caller PID = 56 bytes total, which can be passed around in registers).
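The 56-byte arithmetic works out if the message id and the PID are each 64-bit: 5 × 8 + 8 + 8 = 56. A sketch with invented field names (the actual layout in Perception may differ):

```c
#include <stdint.h>

/* Register-sized message implied by the arithmetic above:
 * five 64-bit parameters, a 64-bit message id, and one 64-bit PID
 * (destination on send, caller on receive) = 56 bytes. */
struct small_message {
    uint64_t message_id;
    uint64_t pid;
    uint64_t params[5];
};

/* Seven 64-bit words total: small enough to pass in registers on
 * x86-64 or AArch64 without touching memory. */
_Static_assert(sizeof(struct small_message) == 56,
               "small message fits the 56-byte budget");
```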

For larger messages, I've been building an IDL called Permebufs inspired by Protocol Buffers, Cap'n Proto, FlatBuffers, etc. The difference is that Permebuf is optimized for the write-once use case, and requires no serialization/deserialization. The underlying buffer is page aligned, so the sender "gifts" the Permebuf's memory pages to the receiver and no copying is involved.
My OS is Perception.
User avatar
AndrewAPrice
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Designs of microkernels

Post by AndrewAPrice »

eekee wrote:with a safe language, you don't need an MMU, and without an MMU, IPC can be a function call.
There are a lot of implementation details we'd need to figure out if we want to do function calls between processes.

Can you share memory between processes? (Let's say I allocate a class in Process A, can Process B access it and call methods on it?)

How do you determine what process owns a chunk of memory?

If ownership of objects can be moved, could a malicious program keep calling a critical service with a large payload?

How do you handle terminating processes? (E.g. Imagine Process A calls Process B which calls Process C, but B terminates while C is still doing something. What will happen when C returns?)

What happens if Process A implements an interface, sends the object to Process B, then A terminates, and B tries to call a virtual method on the interface?

What happens if Process A is updated and the exported function's signature changes, but Process B is a commercial closed source application and the developers haven't rebuilt it for the newer API, and it tries to call Process A with the wrong function signature?
My OS is Perception.
User avatar
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Designs of microkernels

Post by Schol-R-LEA »

eekee wrote:
Schol-R-LEA wrote:Synthesis (which was sort of a hybrid kernel, but an unusual one) reportedly improved IPC dramatically, but did so with a very unusual kernel design combined with batching and 'folding' (pre-computing) both system calls and serial IPC messages. While it showed an improvement on two specific platforms for both IPC and system services compared to contemporary kernel designs, AFAICT no one has tested whether it would show the same improvement on present-day stock hardware.
Ooh! I've added Synthesis to my list of things to study. The concept of pre-computing always puzzles me though. :) Is it like caching generated code?
Reply moved to a new thread - while Synthesis has some hybrid qualities, it is mostly monolithic, so it doesn't really fit here.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
User avatar
eekee
Member
Posts: 881
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Designs of microkernels

Post by eekee »

@Schol-R-Lea: Cool, thanks!
AndrewAPrice wrote:
eekee wrote:with a safe language, you don't need an MMU, and without an MMU, IPC can be a function call.
There are a lot of implementation details we'd need to figure out if we want to do function calls between processes.
That is true. "Just" passing a pointer or calling a function should do more for the speed of the operation than any simplification (I should try to remember that), but some simplifications do arise.
AndrewAPrice wrote:Can you share memory between processes? (Let's say I allocate a class in Process A, can Process B access it and call methods on it?)
Without any memory protection, the answer here is easy: Yes! With simple memory protection the kernel needs to allow it, but only that. There's no need to translate addresses.
AndrewAPrice wrote:How do you determine what process owns a chunk of memory?
Good question. One answer would be to mark them both as owners. Perhaps the original owner tells the kernel to grant co-ownership to another process. Or perhaps the original owner retains ownership, and the other is notified if the original owner exits. I momentarily thought the latter would also require informing the kernel of the connection, but then I thought a process could maintain its own list of processes which need to be notified when it exits. That's probably too loose if a kernel call is just a function call.
AndrewAPrice wrote:If ownership of objects can be moved, could a malicious program keep calling a critical service with a large payload?
Uh.... what does moving ownership have to do with a typical DoS attack?
AndrewAPrice wrote:How do you handle terminating processes? (E.g. Imagine Process A calls Process B which calls Process C, but B terminates while C is still doing something. What will happen when C returns?)

What happens if Process A implements an interface, sends the object to Process B, then A terminates, and B tries to call a virtual method on the interface?
I don't see a difference between these two; both are cases of a depended-upon process terminating. Good call anyway. Looks like it'll need a notification (signalling) system.
AndrewAPrice wrote:What happens if Process A is updated and the exported function's signature changes, but Process B is a commercial closed source application and the developers haven't rebuilt it for the newer API, and it tries to call Process A with the wrong function signature?
This isn't IPC-specific at all; in fact, it's the issue that annoys me the most in all of computing. It's everywhere and practically impossible to get rid of! It applies to any dependency of one package on another, even if the dependency is on a shell script. (Remember, prototypes only provide a means to check an interface. Problems with the interface changing occur regardless of how well-checked it is; in fact, they're more annoying if it isn't well-checked.) Here's an excellent article on it: Our Software Dependency Problem. The article talks about risk, but part of the risk with any dependency is the cost of its interface changing. (I think the article mentions this, but it's been a while since I read it carefully.) I know APIs have to change sometimes, but with my fatigue issues, I'm inclined to view developers who horse around with their API all the time as little better than malicious hackers anyway. :twisted: This includes developers who release without planning well, then have to make a lot of changes. In the past, I've declared, "I've really had enough of my time being wasted by this!" All the same, the problem is hard to get rid of because, no matter how carefully you plan, releasing is necessary to find bugs.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie