I'd like to talk about a form of IPC that doesn't seem to be used by any OS yet*. I dub it "process hopping". It is aimed at microkernels and hybrid kernels.
Instead of a process having one or more local threads that block or yield on IPC, the thread initiating a request gets moved ("hops") to the receiving process. It runs a handler in that process and "hops" back to the sending process once it has fetched the data it needs.
[Figure: a (very) simplified visualization of an IPC request]
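To make the hop concrete, here is a minimal user-space sketch in Python. It is only a toy model, not kernel code: `Process`, `Thread`, `hop` and the `return_stack` are names I made up for illustration, and a real kernel would switch address spaces rather than reassign a field.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Process:
    """A toy address space: a name plus a table of IPC handlers."""
    name: str
    handlers: Dict[str, Callable[["Thread", Any], Any]] = field(default_factory=dict)


@dataclass
class Thread:
    """A thread that migrates between processes instead of blocking."""
    current: Process
    return_stack: List[Process] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)

    def hop(self, target: Process, request: str, payload: Any = None) -> Any:
        # "Kernel" part (assumed): remember where to return, then switch.
        self.return_stack.append(self.current)
        self.current = target
        self.trace.append(f"enter {target.name}")
        try:
            # The same thread runs the handler inside the receiving process.
            return target.handlers[request](self, payload)
        finally:
            # Hop back to the sender, even if the handler raised.
            self.current = self.return_stack.pop()
            self.trace.append(f"back in {self.current.name}")


def read_handler(thread: Thread, path: str) -> str:
    # Runs "inside" the file server: thread.current is the server here.
    return f"{thread.current.name} read {path}"


fs = Process("fs_server", {"read": read_handler})
client = Process("client")
t = Thread(current=client)
result = t.hop(fs, "read", "/etc/motd")
```

Using a stack for the return path means nested hops (client to server A to server B) unwind in the right order, which is one way the "hop back" step could be kept simple.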
A more complex scenario is one where data needs to be fetched from an external source, which may take many milliseconds. Rather than blocking, the thread simply exits once the request has been issued and there is nothing else to be done. When the data is ready, the device triggers an interrupt and the kernel creates a new thread, which is then passed to the server process.
[Figure: a simplified visualization of an IPC request involving data from an external source]
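The exit-then-respawn flow can be sketched the same way. Again this is a toy Python model under my own assumptions: `Kernel`, `run_thread`, `on_irq`, `raise_irq` and the interrupt number are all hypothetical, and the "device" is just a function call.

```python
from typing import Callable, Dict, List


class Kernel:
    """Toy kernel: threads exist only while they have work to do."""

    def __init__(self) -> None:
        self.live_threads = 0
        self.threads_created = 0
        # IRQ number -> continuation to run when the data arrives.
        self.pending: Dict[int, Callable[[bytes], None]] = {}

    def run_thread(self, entry: Callable[..., None], *args) -> None:
        # Create a thread, run it until it exits, then forget it entirely.
        self.live_threads += 1
        self.threads_created += 1
        try:
            entry(*args)
        finally:
            self.live_threads -= 1

    def on_irq(self, irq: int, continuation: Callable[[bytes], None]) -> None:
        # Instead of blocking, a handler registers what should run later.
        self.pending[irq] = continuation

    def raise_irq(self, irq: int, data: bytes) -> None:
        # The device finished: create a fresh thread and hand it to the server.
        continuation = self.pending.pop(irq)
        self.run_thread(continuation, data)


kernel = Kernel()
replies: List[str] = []
DISK_IRQ = 14  # hypothetical interrupt number


def handle_read(block: int) -> None:
    # Runs in the server: start the transfer, save only what is needed
    # (here, the block number captured by the closure), then exit.
    def finish(data: bytes) -> None:
        replies.append(f"block {block}: {data.decode()}")

    kernel.on_irq(DISK_IRQ, finish)


kernel.run_thread(handle_read, 7)
threads_while_waiting = kernel.live_threads  # no thread is parked on the wait
kernel.raise_irq(DISK_IRQ, b"hello")
```

The point of the model is that no thread exists between the two calls: the only state kept across the wait is whatever the server chose to save in its continuation.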
The obvious advantages of this approach are:
- Trivial soft real-time communication, since threads immediately begin processing whatever was requested.
- Simpler scheduler implementation, since there is no concept of a blocked or sleeping thread. Instead, threads are created on demand.
- Lower memory usage, since threads that aren't running simply don't exist. The process still needs to save whatever state it needs to resume the request later, but it can use stack space that is likely allocated anyway.
The potential disadvantages are:
- Restoring the state of the requesting thread can potentially be complex. I think it can be fairly simple with some kernel support but I'm not certain.
- Since spawning a thread per request is the encouraged pattern, there is a risk of too many threads processing requests concurrently, which would degrade performance. Making threads exit when waiting on some external source should prevent this but again, I'm not 100% certain.
- This approach will only be efficient if creating threads is cheap. However, I do believe that it is possible to make threads very cheap to create.
- Context switching can be very expensive, and every hop switches address spaces. Then again, AFAICT the majority of modern CPUs have support for ASIDs/PCIDs, which avoid a full TLB flush on each switch, so hopefully this overhead is minimal.
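On the concerns about thread creation cost and too many concurrent threads: one possible approach (my assumption, not something I have measured) is to preallocate a fixed number of thread slots and manage them as a free list. Creating and destroying a thread then becomes a constant-time pop or push, and the pool size doubles as a cap on concurrency. `ThreadSlotPool` and its methods are hypothetical names in a toy Python model.

```python
from typing import List, Optional


class ThreadSlotPool:
    """Fixed pool of preallocated thread slots managed as a free list."""

    def __init__(self, capacity: int) -> None:
        # Every slot index starts out free; no allocation happens later.
        self.capacity = capacity
        self.free: List[int] = list(range(capacity))

    def create(self) -> Optional[int]:
        # O(1): pop a free slot, or refuse when the cap is reached.
        return self.free.pop() if self.free else None

    def exit(self, slot: int) -> None:
        # O(1): return the slot so the next request can reuse it.
        self.free.append(slot)

    @property
    def in_use(self) -> int:
        return self.capacity - len(self.free)


pool = ThreadSlotPool(capacity=2)
a = pool.create()
b = pool.create()
c = pool.create()  # cap reached: the caller must queue or reject the request
pool.exit(a)
d = pool.create()  # reuses the slot that was just released
```

What to do when `create` returns `None` (queue the request, reject it, or let the sender retry) is a policy question this sketch deliberately leaves open.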
* Perhaps someone has already implemented it under a fancy name that I didn't think of. My search-fu is failing me.