I'm at the point where I am implementing my first application interface in my microkernel. Context switching works. A sample call (via INT) works too, including the result value (one DWORD). Generally, we are talking about small values that fit on the stack or in registers, no big data (that should be transferred via shared memory).
My question is how to call dependent services. With the first level, it's simple:
UserThread: INT 250 --> kernel interrupt takes the kernel stack from tss.ESP, executes the chosen function, returns (IRET) back to the user thread. tss.ESP is then free for the next thread.
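The first-level path can be sketched roughly as follows. This is only an illustration under assumptions: the function number arriving in EAX, the argument in EBX, and the names `struct regs`, `syscall_table` and `dispatch_int250` are all hypothetical and not taken from the kernel described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register frame saved by the INT 250 entry stub. */
struct regs {
    uint32_t eax; /* in: function number, out: DWORD result */
    uint32_t ebx; /* in: single DWORD argument */
};

typedef uint32_t (*kfunc_t)(uint32_t arg);

/* Two placeholder kernel functions, for illustration only. */
static uint32_t k_echo(uint32_t arg)   { return arg; }
static uint32_t k_double(uint32_t arg) { return arg * 2u; }

static const kfunc_t syscall_table[] = { k_echo, k_double };

#define SYSCALL_COUNT (sizeof syscall_table / sizeof syscall_table[0])
#define ERR_BAD_CALL  0xFFFFFFFFu

/* Runs on the kernel stack; the entry stub would IRET afterwards. */
static void dispatch_int250(struct regs *r)
{
    assert(r != NULL);
    if (r->eax >= SYSCALL_COUNT) {
        r->eax = ERR_BAD_CALL;          /* unknown function number */
        return;
    }
    r->eax = syscall_table[r->eax](r->ebx); /* result goes back in EAX */
}
```

Because the handler runs to completion and returns, nothing remains on the kernel stack afterwards, which is why a single stack is enough at this level.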
But what happens if the kernel/interrupt context wants to make a remote call to a microservice? The original user thread stack and the kernel stack need to be protected, and I want to avoid creating a new thread for the sub-call.
One idea I have is the following way:
- UserThread A: INT 250
- Kernel Interrupt, takes tss.ESP's kernel stack
- Before saving register states, move to another ESP, so tss.esp is not used exclusively anymore.
- the new ESP is "on top of the user stack", but page aligned and will be unmapped from the original user space.
- If we need to call a service, we create a new 4k page below (or "on top") of the stack, and map it for the service address space.
- Call Service B.
- Now, if the service needs to call another IPC method C (a sub-service), it acts like a normal user thread: INT 250, and so on.
The stack would look like this (all 4k aligned, from stack bottom to top):
UserStack
KernelStack
Service Stack
KernelStack
Sub-Service-Stack
...
This way I can cleanly unwind the stacks, and it simplifies finding new stack memory: (current ESP + 4096) & ~0xFFF, plus map/unmap for stack protection, using INVLPG to invalidate a single page.
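The page arithmetic can be sketched like this. Rounding to a page boundary means clearing the low 12 bits, so the mask has to be `~0xFFF`. The helper names are hypothetical, and since x86 stacks grow downward, which neighbouring page is "next" depends on the layout chosen; both directions are shown.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PAGE_MASK (PAGE_SIZE - 1u)

/* Round an address down to the start of its 4 KiB page. */
static uint32_t page_floor(uint32_t addr)
{
    return addr & ~PAGE_MASK;
}

/* Start of the page directly above the one containing esp.
 * Equivalent to (esp + 4096) & ~0xFFF when there is no overflow. */
static uint32_t page_above(uint32_t esp)
{
    return page_floor(esp) + PAGE_SIZE;
}

/* Initial ESP for a fresh one-page stack placed directly below the
 * page containing esp. The stack grows down from this address. */
static uint32_t new_stack_top_below(uint32_t esp)
{
    assert(page_floor(esp) >= PAGE_SIZE); /* there must be a page below */
    return page_floor(esp);
}

/* After remapping such a page, a real kernel would issue INVLPG for
 * that address so the stale TLB entry is dropped; omitted here. */
```

For example, with ESP = 0x00123456 the page above starts at 0x00124000, and a new stack in the page below would start with ESP = 0x00123000.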
This is pure synchronous design, asynchronous is not planned yet.
I haven't started this implementation yet. I want to ask whether you see any problems, or how to solve this better.
Goal: fast IPC in a single thread, synchronous (= blocking), with the ability for a service to call sub-services (a must-have feature) without the overhead of additional threads. No big messages, only small values (for example a single DWORD).
Fast Single Thread RPC / IPC with sub-calls, how to stack?
Re: Fast Single Thread RPC / IPC with sub-calls, how to stac
Isn't this just the old kernel-stack question? Do you want to have one kernel stack per CPU or one per thread?
If you want to use one kernel stack per CPU then you must never block in kernel mode. To achieve that, you have to actually write down every possible reason for a task to be blocked, and then note that reason correctly in the task control block. In this system, a system call would be executed by saving the task registers on entry to the syscall, then noting down why you blocked the thread (e.g. because you are waiting for an answer from another thread) and moving on to schedule a different thread. Then, when the answer comes in, you can unblock the original task, updating the registers as necessary.
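A rough sketch of that bookkeeping, assuming a made-up task control block. The names `struct task`, `enum block_reason`, `block_for_reply` and `deliver_reply` are illustrative only, and the scheduler itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every reason a task can be blocked has to be listed explicitly. */
enum block_reason { BLOCK_NONE, BLOCK_WAIT_REPLY, BLOCK_WAIT_IRQ };

/* User registers saved on syscall entry (reduced to a few fields). */
struct saved_regs { uint32_t eax, ebx, eip, esp; };

struct task {
    struct saved_regs regs;
    enum block_reason blocked_on;
    struct task *waiting_for;   /* server we expect the answer from */
};

/* Syscall path: record why the task stops. Nothing stays on the
 * shared kernel stack, so it can be reused immediately. */
static void block_for_reply(struct task *caller, struct task *server)
{
    assert(caller->blocked_on == BLOCK_NONE);
    caller->blocked_on = BLOCK_WAIT_REPLY;
    caller->waiting_for = server;
    /* a real kernel would now schedule a different task */
}

/* Reply path: patch the result into the saved registers and make the
 * caller runnable. Returns 0 on success, -1 if the caller was not
 * waiting for this server. */
static int deliver_reply(struct task *caller, const struct task *server,
                         uint32_t result)
{
    if (caller->blocked_on != BLOCK_WAIT_REPLY ||
        caller->waiting_for != server)
        return -1;
    caller->regs.eax = result;  /* seen by the caller after IRET */
    caller->blocked_on = BLOCK_NONE;
    caller->waiting_for = NULL;
    return 0;
}
```

Each additional blocking condition needs its own enum value plus matching wake-up code, which is where the complexity of this approach comes from.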
See, this way gets really complicated really fast. More sustainable is the second option. In that case you allocate a new stack when you create a new task (the special case of the first task being that it already has a stack from the bootloader), and you only switch out tss.ESP0 when switching tasks. That way, the CPU state can be saved to kernel stack, and you can just wait in your system calls.
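A minimal sketch of the per-thread variant, assuming one 4 KiB kernel stack per thread. The structures are reduced to the fields that matter here, and `switch_to`, `cpu_tss` and `struct thread` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096u

/* Only the TSS field used here; the real x86 TSS has many more. */
struct tss { uint32_t esp0; };

struct thread {
    uint32_t kstack_base; /* lowest address of this thread's kernel stack */
    uint32_t saved_esp;   /* kernel ESP saved when the thread last stopped */
};

static struct tss cpu_tss;
static struct thread *current;

static void switch_to(struct thread *next)
{
    assert(next != 0);
    /* On a ring 3 -> ring 0 transition the CPU loads ESP from
     * tss.esp0, so it must point at the top of the kernel stack that
     * belongs to the thread about to run. */
    cpu_tss.esp0 = next->kstack_base + KSTACK_SIZE;
    current = next;
    /* A real kernel would now load ESP from next->saved_esp and
     * switch address spaces if necessary. */
}
```

With this layout a thread that waits inside a system call keeps its saved state on its own kernel stack, so interrupts taken while other threads run cannot overwrite it.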
Carpe diem!
Re: Fast Single Thread RPC / IPC with sub-calls, how to stac
Hi,
I spent the last day thinking about what you said, what my goal is, and how best to achieve it.
My goal is/was: having the "same" user thread for service calls and sub-service calls when the service itself needs further services. I wanted to avoid the cost of creating a thread for each request and child request. For example: the user wants to read from a file, asks the VFS, which asks the specific mounted VFS, which asks the block service, which asks the PCI service, and every service is a separate userland process (just a sample!).
But the more I think about it, the more I think I should go the correct way: a separate thread for every request. The reason is that managing the temporary stack pages with different mappings in another address space (so that sub-requests cannot damage foreign pages / stack data), plus all the other checks, costs nearly the same as simply creating a new thread (or using a thread pool) and suspending/blocking the current one. This will make maintenance much easier. Thread termination would also be easier if the user process dies.
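The thread pool option could look roughly like this. It is a sketch with a fixed pool size and made-up names (`struct worker`, `pool_acquire`, `pool_release`); real workers would be kernel-managed threads with their own stacks rather than plain flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4

/* One pre-created service thread; only the bookkeeping is modelled. */
struct worker {
    bool busy;
    int  request_id;  /* request currently being served */
};

static struct worker pool[POOL_SIZE];

/* Take an idle worker for one request. NULL means every worker is
 * busy, so the caller has to wait or fail the request. */
static struct worker *pool_acquire(int request_id)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].busy) {
            pool[i].busy = true;
            pool[i].request_id = request_id;
            return &pool[i];
        }
    }
    return NULL;
}

/* Hand the worker back once the reply has been delivered. */
static void pool_release(struct worker *w)
{
    assert(w != NULL && w->busy);
    w->busy = false;
}
```

Each level of the chain (VFS, block service, and so on) would take a worker from its own pool while its caller stays blocked, so no thread has to be created on the request path.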
Threading / context switching is already implemented (thanks to Michael for his great explanation).
By the way, I didn't understand what you mean by simply changing tss.esp0. If we assume that every process/thread, regardless of whether it is a kernel thread, user thread, or service user thread, should be preemptible (except within the scheduler interrupt), I think it makes no sense to change tss.esp0.
As an experiment, I now allocate separate page(s) for every thread and, on each switch, set tss.esp0 to the page associated with that thread. It works, but it makes no sense to me, because if an interrupt happens, I "always" need to switch somewhere else (because of the timer, or because a service needs to be consulted). So, as I understand it, a single tss.esp0 (kernel stack pointer) is enough. Of course, the threads have their own USER stacks, and those are never mixed with a kernel stack (so they are protected).