Brendan wrote: Hi,
linguofreak wrote: Most of the functions traditionally handled by a monolithic kernel are things that are properly implemented in a library, but for which there is special system-wide data that needs to be maintained. Current architectures make it easy to have one such library with one set of system-wide data (the kernel), but difficult to have multiple libraries with system-wide data that are isolated from each other. Microkernels on such architectures have to kludge around this by turning what should be a function call to a library without changing threads into message passing between different processes.
The opposite is more true: most of the functions traditionally handled by a monolithic kernel are things that have no system-wide data (e.g. they have only scheduler-specific state, device-specific state, file-system-specific state, network-stack-specific state, etc.); and smushing it all together and putting all of it into a single place violates the principle of least privilege. Function calls (which only work within the same "protection domain") are inadequate; and current architectures make switching between "protection domains" easy.
You misunderstand what I mean by "system-wide". Compare a typical userspace library to a kernel component: The userspace library, when dealing with a call from a given process, only needs access to the data it is keeping for that process. While dealing with that process it can act like no other processes exist. It has no need to protect its data from the process, and operates as part of the process. A kernel component needs to keep data on all processes and have access to it any time the kernel component is called, and cannot have that data be accessible to any process. This is what I mean by "system-wide". Even if kernel components A and B are isolated from each other, component A serving process X has to have access to the same data as component A serving process Y, and likewise for B.
Current hardware typically has a small number of protection domains per address space, usually just two: user and kernel. The most I've ever seen is four, on x86 (and, I think, on VAX as well). Kernel components are isolated from processes by putting the kernel components in kernelspace and the processes in userspace. They keep their data system-wide by making kernelspace the same in every process. Processes are separated from each other by making userspace different for every process. But there's no way to isolate the kernel components from each other on such hardware except by putting them into completely separate processes themselves. The preferable solution would be for the hardware to support a large number of protection domains per address space, so that each kernel component could be put in its own protection domain without having to switch processes.
On current hardware, calling a driver under a microkernel goes like this:
1. Process crafts message
2. Process makes system call to kernel to send message to driver. Hardware switches from user to kernel protection domain.
3. Kernel switches to driver's address space
4. Kernel makes callback to driver. Hardware switches from kernel to user protection domain.
5. Driver interprets message
6. Driver calls appropriate internal function to act on message
7. Once it has results, driver crafts message to return them to process.
8. Driver makes system call to send message to process. Hardware switches from user to kernel protection domain.
9. Kernel switches to process's address space
10. Kernel returns to process. Hardware switches from kernel to user protection domain.
11. Process interprets message
If the driver needs to call another driver to fulfill a request, insert steps 1 through 11 (with the first driver in the role of "process" and the second in the role of "driver") between steps 6 and 7.
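To make the bookkeeping concrete, here is a rough Python sketch that walks through the eleven steps above and tallies the switches, nesting a second call between steps 6 and 7 when one driver calls another. The names (`ipc_call`, `count_switches`, the `PD`/`AS` labels) are made up for illustration, and it only models the step sequence described here, not any real microkernel's IPC path.

```python
from collections import Counter

# Hypothetical labels for the two kinds of switch being counted.
PD = "protection-domain switch"
AS = "address-space switch"

def ipc_call(caller, chain):
    """Yield (description, cost) pairs for `caller` invoking chain[0].

    chain[1:] are drivers that chain[0] must call in turn to serve the
    request. cost is PD, AS, or None for steps that involve no switch.
    """
    callee, rest = chain[0], chain[1:]
    yield (f"{caller} crafts message", None)                          # step 1
    yield (f"{caller} traps to kernel to send to {callee}", PD)       # step 2
    yield (f"kernel switches to {callee}'s address space", AS)        # step 3
    yield (f"kernel makes callback to {callee}", PD)                  # step 4
    yield (f"{callee} interprets message", None)                      # step 5
    yield (f"{callee} calls internal function", None)                 # step 6
    if rest:
        # A nested driver call repeats steps 1-11 between steps 6 and 7.
        yield from ipc_call(callee, rest)
    yield (f"{callee} crafts reply message", None)                    # step 7
    yield (f"{callee} traps to kernel to send reply to {caller}", PD) # step 8
    yield (f"kernel switches to {caller}'s address space", AS)        # step 9
    yield (f"kernel returns to {caller}", PD)                         # step 10
    yield (f"{caller} interprets reply", None)                        # step 11

def count_switches(chain):
    """Tally the switches for a process calling the given driver chain."""
    return Counter(cost for _, cost in ipc_call("process", chain) if cost)

if __name__ == "__main__":
    for description, cost in ipc_call("process", ["fs driver", "disk driver"]):
        print(f"{description}" + (f"  [{cost}]" if cost else ""))
    print(count_switches(["fs driver", "disk driver"]))
```

With a single driver in the chain this counts 4 protection-domain switches and 2 address-space switches; every extra driver in the chain adds the same amount again.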
Under a monolithic kernel, calling a driver goes like this:
1. Process makes system call to kernel. Hardware switches from user to kernel protection domain.
2. Kernel interprets system call arguments, passes call to appropriate driver
3. Driver interprets system call arguments, calls appropriate internal function
4. Once it has results, driver returns to kernel system-call processing code
5. Kernel returns to process. Hardware switches from kernel to user protection domain.
If the driver needs to call another driver or kernel component, we basically repeat step 3.
Under a microkernel on microkernel-friendly hardware, calling a driver goes like this:
1. Process makes system call to driver. Hardware switches from user to driver protection domain.
2. Driver interprets system call arguments, passes call to appropriate internal function (depending on hardware and operating system architecture, it's possible that the driver exposes multiple entry points and the process was able to call the appropriate function directly, in which case this step is not necessary).
3. Once it has results, driver returns to process. Hardware switches from driver to user protection domain.
If the driver needs to call another driver or kernel component, repeat steps 1 through 3 (substituting "driver 1" and "driver 2" for "process" and "driver") between steps 2 and 3. This potentially makes the microkernel on friendly hardware a bit slower than a monolithic kernel, but by little enough that the microkernel approach becomes worthwhile.
The microkernel on current hardware makes 4 protection domain switches and 2 address space switches for every driver call, and preserves isolation between drivers.
The monolithic kernel makes 2 protection domain switches for every time a process makes a driver call, no switches when a driver makes a driver call, and does not preserve driver isolation.
The microkernel on friendly hardware makes 2 protection domain switches for every driver call, and preserves driver isolation.
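As a back-of-the-envelope comparison, the three tallies can be written as functions of how many drivers are chained together in one request (process calls driver 1, which calls driver 2, and so on). This is only a sketch of the counts given in this post; the function and field names are invented for illustration, and it says nothing about how expensive each kind of switch is on real hardware.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SwitchCount:
    protection_domain: int  # user<->kernel or user<->driver transitions
    address_space: int      # full address-space (process) switches
    drivers_isolated: bool  # are drivers isolated from each other?

def microkernel_current(chain_length: int) -> SwitchCount:
    # Every driver call, nested or not, costs 4 + 2.
    return SwitchCount(4 * chain_length, 2 * chain_length, True)

def monolithic(chain_length: int) -> SwitchCount:
    # Only the process's entry into and exit from the kernel cost anything;
    # driver-to-driver calls are ordinary function calls.
    return SwitchCount(2 if chain_length > 0 else 0, 0, False)

def microkernel_friendly(chain_length: int) -> SwitchCount:
    # Every driver call costs one switch in and one switch out,
    # with no address-space switch.
    return SwitchCount(2 * chain_length, 0, True)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, microkernel_current(n), monolithic(n), microkernel_friendly(n))
```

For a chain of three drivers this gives 12 protection domain switches and 6 address space switches on current hardware, 2 and 0 for the monolithic kernel, and 6 and 0 on friendly hardware, which is the gap I'm arguing is small enough to be worth paying for isolation.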