Hi,
Combuster wrote:Common practice is to store the thread state inside a dedicated kernel stack upon task switch. Which means you will never ever lose 4k per process, but only the part of the stack that was not in use at the time of the switch. If you use the kernel stack for kernel preemption there's essentially two contexts on each stack.
Common practice is to use the kernel stack to store general purpose registers, and some sort of "thread data block" to store things like FPU/MMX/SSE state and various other pieces of information about the thread (thread name, time used, signal mask, message queue head and tail, etc). The (user space) general purpose registers add up to about 32 bytes (for protected mode) or 128 bytes (for long mode). FPU/MMX/SSE state is 512 bytes, and by the time you allow for the rest of the stuff you can round it up to 1 KiB, and pack the 32/128 bytes of general purpose stuff in there too for "free" (instead of padding for alignment).
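To make the sizes concrete, here's a rough sketch of what such a "thread data block" might look like in C for long mode. The type names, field names and field sizes are all made up for illustration (only the 512-byte FXSAVE area, its 16-byte alignment and the 128 bytes of general purpose registers come from the hardware); a real kernel would have its own layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Long mode: 16 general purpose registers x 8 bytes = 128 bytes. */
struct gp_regs {
    uint64_t r[16];
};

/* Hypothetical per-thread data; everything except the sizes of the
 * first two fields is an assumption made for this example. */
struct thread_data {
    _Alignas(16) uint8_t fpu_state[512]; /* FXSAVE area, needs 16-byte alignment */
    struct gp_regs regs;                 /* packed in instead of padding */
    char name[32];
    uint64_t time_used;
    uint64_t signal_mask;
    void *msg_queue_head;
    void *msg_queue_tail;
};

/* Round the whole thing up to exactly 1 KiB. */
union thread_data_block {
    struct thread_data d;
    uint8_t pad[1024];
};

static_assert(sizeof(struct gp_regs) == 128, "GP registers should be 128 bytes");
static_assert(offsetof(struct thread_data, fpu_state) % 16 == 0,
              "FXSAVE area must be 16-byte aligned");
static_assert(sizeof(union thread_data_block) == 1024,
              "thread data block should be exactly 1 KiB");
```

The union is just one way to force the 1 KiB size; the compile-time checks fail the build if the fields ever outgrow it.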
How big would a thread's kernel stack need to be? Let's assume worst case the kernel could use 2 KiB of stack itself, but IRQ handlers use 512 bytes each on top of that, and maybe you could have up to 12 IRQs nested. That works out to around 8 KiB of kernel stack per thread (to handle the worst case). Obviously this will be a problem (e.g. close to 40 MiB of RAM wasted for 5000 threads), so your next step is going to be having separate stacks for IRQ handlers, to try to get the amount of RAM wasted down (e.g. 4 KiB per thread for kernel stacks, plus a 4 KiB stack per CPU for IRQ handling). This should all sound strangely familiar to people who know about CONFIG_4KSTACKS in Linux.
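The arithmetic behind those numbers can be sketched like this. The constants are just the assumed figures from above (2 KiB for the kernel, 512 bytes per IRQ handler, 12 nested IRQs), and the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define KERNEL_WORST_CASE 2048u  /* assumed kernel stack use, bytes */
#define IRQ_HANDLER_COST   512u  /* assumed stack use per IRQ handler */
#define MAX_NESTED_IRQS     12u  /* assumed worst-case IRQ nesting */
#define SPLIT_STACK_SIZE  4096u  /* per-thread and per-CPU stack when split */

/* 2 KiB + 12 * 512 bytes = 8 KiB */
static_assert(KERNEL_WORST_CASE + MAX_NESTED_IRQS * IRQ_HANDLER_COST == 8192u,
              "combined worst case should be 8 KiB");

/* One stack per thread that must also absorb all nested IRQs. */
size_t combined_stack_bytes(void)
{
    return KERNEL_WORST_CASE + MAX_NESTED_IRQS * IRQ_HANDLER_COST;
}

size_t combined_total_bytes(size_t threads)
{
    return threads * combined_stack_bytes();
}

/* 4 KiB kernel stack per thread plus one 4 KiB IRQ stack per CPU. */
size_t split_total_bytes(size_t threads, size_t cpus)
{
    return (threads + cpus) * SPLIT_STACK_SIZE;
}
```

For 5000 threads the combined scheme comes to 40,960,000 bytes (about 39 MiB). With separate IRQ stacks and, say, 4 CPUs (the CPU count is my assumption here) it's 20,496,384 bytes, or roughly 19.5 MiB.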
Combuster wrote:Then there's all the other resources each thread uses; I have per-thread kernel stacks yet I only have one page of memory in use to maintain all thread control. If you share kernel stacks, you lose the option of preemptible kernels and all the info you need to store to cover for that takes up memory elsewhere.
I've been thinking of preemptible kernels. A system of "preemption points" is easy enough and would only cost 2 integers per thread (a pointer to a kernel routine to call and 1 piece of data to pass to that kernel routine) in the thread's "thread data area". Lengthy operations (e.g. creating a process or creating a thread) would be split up into small pieces by these preemption points; and when the kernel reaches one it'd check if it should continue or allow the operation to be preempted (and set those 2 integers to allow the operation to continue later). Later, if the kernel is about to pass control back to the thread it'd check these values, and if they're set call the kernel routine (to continue the preempted operation) instead of passing control back to the thread. It doesn't sound too hard, or too expensive. It's also probably just as preemptible as a "fully preemptible" kernel where task switches are typically disabled in critical sections anyway.
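Here's a minimal sketch of how those preemption points might look, just to show the idea. Everything in it is hypothetical: the struct and function names, the `preempt_requested` flag (standing in for whatever the scheduler would really use), and the "lengthy operation" itself, which is just a counter split into 8 pieces. The task switch that would happen between being preempted and being resumed is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct thread;
typedef void (*resume_fn)(struct thread *t, uintptr_t data);

struct thread {
    resume_fn resume;        /* the 2 integers: routine to call... */
    uintptr_t resume_data;   /* ...and 1 piece of data to pass to it */
    bool preempt_requested;  /* hypothetical "something else wants the CPU" flag */
    int work_done;           /* demo only: pieces of work completed */
};

#define TOTAL_STEPS 8u

/* A lengthy kernel operation, split into TOTAL_STEPS small pieces. */
void long_operation(struct thread *t, uintptr_t step)
{
    assert(t != NULL);
    while (step < TOTAL_STEPS) {
        t->work_done++;  /* do one small piece of the operation */
        step++;
        /* Preemption point: save where to continue, then bail out. */
        if (step < TOTAL_STEPS && t->preempt_requested) {
            t->resume = long_operation;
            t->resume_data = step;
            return;
        }
    }
    t->resume = NULL;    /* operation finished, nothing left to continue */
}

/* Called when the kernel is about to pass control back to the thread.
 * Returns true if it's OK to return to user space, false if the
 * operation was preempted again. */
bool return_to_thread(struct thread *t)
{
    assert(t != NULL);
    if (t->resume != NULL) {
        resume_fn fn = t->resume;
        uintptr_t data = t->resume_data;
        t->resume = NULL;
        fn(t, data);     /* continue the preempted operation first */
    }
    return t->resume == NULL;
}
```

The per-thread cost is just `resume` and `resume_data`; the operation's own state has to fit in that one piece of data (or be reachable through it), which is the main design constraint of doing it this way.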
Cheers,
Brendan