Software based task switching - Patching esp0

Post by **Combuster** » Mon Jan 23, 2012 7:47 am

bluemoon wrote: Suppose you have 5000 threads and each has a 4K kernel stack; that uses up 20MB of non-swappable memory for virtually nothing.

Common practice is to store the thread state on a dedicated kernel stack upon task switch. That means you never lose a full 4K per process, only the part of the stack that was not in use at the time of the switch. If you also use the kernel stack for kernel preemption, there are essentially two contexts on each stack.

In other words, kernel stacks lose the most when the task in question is CPU bound, in which case you shouldn't have more threads than CPU cores.

Then there are all the other resources each thread uses; I have per-thread kernel stacks, yet I only have one page of memory in use to maintain all thread control. If you share kernel stacks, you lose the option of a preemptible kernel, and all the info you need to store to cover for that takes up memory elsewhere.

KISS, anyone?

Post by **egos** » Mon Jan 23, 2012 9:16 am

raghuk wrote: I didn't understand that. Is there an advantage in storing the thread structures in the kernel stack rather than in the kernel heap? Right now I kmalloc() the space required for thread structures and free it when the thread terminates. So I just need one pointer to the currently running thread and another pointer to a queue of ready threads.

One pointer is enough. As I said above, you can use TSS.SP0 as the pointer to the current thread and keep the thread in the run queue the whole time it is active/ready. Just put the thread structure at the upper end of the stack region. For example, here is the layout of my kernel stack region:
- thread structure;
- kernel stack (including reserve for interrupt handling);
- guard page.
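
The layout in the list above can be sketched in C as follows. This is only an illustration, not egos's actual code: `struct thread`, `struct tss`, `REGION_SIZE` and all function names are hypothetical, the region is assumed to be two pages aligned to its own size, and the TSS is reduced to the single SP0 field. The point it shows is that when the thread structure sits directly above the kernel stack, the SP0 value and the thread pointer are the same address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE   4096u
#define REGION_SIZE (2u * PAGE_SIZE)  /* assumption: guard page + one page for stack and structure */

/* Hypothetical thread structure; the fields are illustrative only. */
struct thread {
    struct thread *next_ready;  /* link used while the thread is in the run queue */
    uint32_t       id;
};

/* Stand-in for the one TSS field that matters here. */
struct tss {
    uintptr_t sp0;  /* stack pointer loaded on a ring 3 -> ring 0 transition */
};

/* Layout, highest address first: thread structure, kernel stack, guard page. */
static struct thread *thread_region_init(void *region, uint32_t id)
{
    /* The region must be aligned to its size for thread_stack_limit(). */
    assert(((uintptr_t)region & (REGION_SIZE - 1)) == 0);

    uintptr_t top = (uintptr_t)region + REGION_SIZE;
    struct thread *t = (struct thread *)(top - sizeof(struct thread));
    t->next_ready = NULL;
    t->id = id;
    return t;
}

/* The kernel stack grows down from the base of the thread structure. */
static uintptr_t thread_sp0(const struct thread *t)
{
    return (uintptr_t)t;
}

/* Because SP0 equals the structure's address, it doubles as "current". */
static struct thread *current_thread(const struct tss *tss)
{
    return (struct thread *)tss->sp0;
}

/* Lowest usable stack address; everything below it is the guard page. */
static uintptr_t thread_stack_limit(const struct thread *t)
{
    uintptr_t region = (uintptr_t)t & ~(uintptr_t)(REGION_SIZE - 1);
    return region + PAGE_SIZE;
}
```

In a real kernel the guard page would be left unmapped so that a stack overflow faults instead of silently overwriting a neighbouring region; the sketch only computes where that boundary lies.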

raghuk wrote: I understand option 1 will perform better because each thread switch does not involve invalidating PTEs/PDEs. But if I switch to a different process, I need to load a new value into cr3 anyway.

With option 1 all kernel stacks are placed in kernel space, which is global, so the kernel stack entries can stay in the TLB until the next time the thread is activated.

Post by **Brendan** » Mon Jan 23, 2012 10:44 am

Hi,

Combuster wrote: Common practice is to store the thread state on a dedicated kernel stack upon task switch. That means you never lose a full 4K per process, only the part of the stack that was not in use at the time of the switch. If you also use the kernel stack for kernel preemption, there are essentially two contexts on each stack.

Common practice is to use the kernel stack to store general purpose registers, and some sort of "thread data block" to store things like FPU/MMX/SSE state and various other pieces of information about the thread (thread name, time used, signal mask, message queue head and tail, etc). The (user space) general purpose registers add up to about 32 bytes (for protected mode) or 128 bytes (for long mode). FPU/MMX/SSE state is 512 bytes, and by the time you allow for the rest of the stuff you can round it up to 1 KiB, and pack the 32/128 bytes of general purpose stuff in there too for "free" (instead of padding for alignment).
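
To make the size budget concrete, here is a rough C sketch of such a 1 KiB block for long mode. The struct name, field names, field types and the size of the name buffer are all assumptions for illustration; only the 512-byte FPU/MMX/SSE area, the 128 bytes of general purpose registers and the 1 KiB total come from the description above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define FXSAVE_AREA_SIZE 512u   /* size of the FXSAVE/FXRSTOR image */
#define THREAD_DATA_SIZE 1024u  /* the "round it up to 1 KiB" budget */

/* Hypothetical field layout; only the sizes follow the post. */
struct thread_data {
    alignas(16) uint8_t fpu_state[FXSAVE_AREA_SIZE]; /* FXSAVE requires 16-byte alignment */
    uint64_t gpr[16];          /* 16 registers * 8 bytes = 128 bytes in long mode */
    char     name[32];         /* assumed size for the thread name */
    uint64_t time_used;
    uint64_t signal_mask;
    uint64_t msg_queue_head;
    uint64_t msg_queue_tail;
};

/* Pad the block to exactly 1 KiB so blocks pack evenly into pages. */
union thread_data_block {
    struct thread_data data;
    uint8_t            raw[THREAD_DATA_SIZE];
};

static_assert(sizeof(struct thread_data) <= THREAD_DATA_SIZE,
              "thread data must fit in the 1 KiB budget");
static_assert(sizeof(union thread_data_block) == THREAD_DATA_SIZE,
              "block must be exactly 1 KiB");
```

With these assumed fields the named data occupies 704 bytes, so roughly 300 bytes of the block remain for "the rest of the stuff".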

How big would a thread's kernel stack need to be? Let's assume worst case the kernel could use 2 KiB of stack itself, but IRQ handlers use 512 bytes each on top of that, and maybe you could have up to 12 IRQs nested. That works out to around 8 KiB of kernel stack per thread (to handle the worst case). Obviously this will be a problem (e.g. close to 40 MiB of RAM wasted for 5000 threads), so your next step is going to be having separate stacks for IRQ handlers, to try to get the amount of RAM wasted down (e.g. 4 KiB per thread for kernel stacks, plus a 4 KiB stack per CPU for IRQ handling). This should all sound strangely familiar to people who know about CONFIG_4KSTACKS in Linux.
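
The arithmetic behind those figures can be checked with a few lines of C. The macro and function names are hypothetical, and the CPU count passed to the second function is an arbitrary example value, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024u

/* Figures taken from the estimate above. */
#define KERNEL_STACK_USE (2u * KIB)  /* worst-case use by the kernel itself */
#define IRQ_STACK_USE    512u        /* per nested IRQ handler */
#define MAX_NESTED_IRQS  12u
#define WORST_CASE_STACK (KERNEL_STACK_USE + MAX_NESTED_IRQS * IRQ_STACK_USE)

static_assert(WORST_CASE_STACK == 8u * KIB, "2 KiB + 12 * 512 bytes is 8 KiB");

/* Total bytes when IRQ handlers run on each thread's own kernel stack. */
static uint64_t total_with_shared_stacks(uint64_t threads)
{
    return threads * WORST_CASE_STACK;
}

/* Total bytes with 4 KiB thread stacks plus one 4 KiB IRQ stack per CPU. */
static uint64_t total_with_irq_stacks(uint64_t threads, uint64_t cpus)
{
    return threads * (4u * KIB) + cpus * (4u * KIB);
}
```

For 5000 threads the first function gives 40,960,000 bytes (about 39 MiB, the "close to 40 MiB" above); with separate IRQ stacks on, say, 4 CPUs the second gives 20,496,384 bytes, roughly half.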

Combuster wrote: Then there are all the other resources each thread uses; I have per-thread kernel stacks, yet I only have one page of memory in use to maintain all thread control. If you share kernel stacks, you lose the option of a preemptible kernel, and all the info you need to store to cover for that takes up memory elsewhere.

I've been thinking about preemptible kernels. A system of "preemption points" is easy enough and would cost only 2 integers per thread (a pointer to a kernel routine to call and 1 piece of data to pass to that kernel routine) in the thread's "thread data area". Lengthy operations (e.g. creating a process or creating a thread) would be split up into small pieces by these preemption points; when the kernel reaches one it'd check whether it should continue or allow the operation to be preempted (and set those 2 integers to allow the operation to continue later). Later, if the kernel is about to pass control back to the thread, it'd check these values and, if they're set, call the kernel routine (to continue the preempted operation) instead of passing control back to the thread. It doesn't sound too hard, or too expensive. It's also probably just as preemptible as a "fully preemptible" kernel, where task switches are typically disabled in critical sections anyway.
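
A minimal user-space sketch of that idea, under stated assumptions: `struct thread_preempt_state`, `preempt_requested`, `lengthy_operation` and `before_return_to_thread` are hypothetical names, the "lengthy operation" is just a counter, and the actual task switch that would happen between the two calls is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct thread_preempt_state;
typedef void (*kernel_routine)(struct thread_preempt_state *t, uintptr_t data);

/* The "2 integers per thread" kept in the thread data area. */
struct thread_preempt_state {
    kernel_routine resume_fn;    /* NULL when no operation is pending */
    uintptr_t      resume_data;  /* single value handed back to resume_fn */
};

static bool     preempt_requested;  /* assumption: set by the timer IRQ or scheduler */
static unsigned steps_done;         /* instrumentation for this example only */

/* A lengthy operation split into small pieces; `remaining` counts the pieces left. */
static void lengthy_operation(struct thread_preempt_state *t, uintptr_t remaining)
{
    assert(t != NULL);
    while (remaining > 0) {
        steps_done++;            /* one small piece of work */
        remaining--;

        /* Preemption point: record how to continue, then back out. */
        if (remaining > 0 && preempt_requested) {
            t->resume_fn = lengthy_operation;
            t->resume_data = remaining;
            return;
        }
    }
}

/* Called when the kernel is about to pass control back to the thread.
 * Returns true if the thread may run, false if it was preempted again. */
static bool before_return_to_thread(struct thread_preempt_state *t)
{
    if (t->resume_fn == NULL)
        return true;

    kernel_routine fn = t->resume_fn;
    uintptr_t data = t->resume_data;
    t->resume_fn = NULL;
    t->resume_data = 0;
    fn(t, data);                 /* continue the preempted operation */
    return t->resume_fn == NULL;
}
```

Because all continuation state lives in the two saved values, nothing of the interrupted operation has to remain on a kernel stack while the thread is switched out, which is what makes this compatible with shared kernel stacks.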

Cheers,

Brendan

Post by **raghuk** » Mon Jan 23, 2012 11:55 am

egos wrote: With option 1 all kernel stacks are placed in kernel space, which is global, so the kernel stack entries can stay in the TLB until the next time the thread is activated.

Ah, I see your point. I intend to support CPUs older than Pentium Pro so I hadn't paid much attention to global pages. But in any case, option 1 will have an advantage because all threads in the same process will have the exact same virtual address space. And I can conditionally enable global pages on P6 and above too.
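
Conditionally enabling global pages could look roughly like the sketch below. The bit positions are the architectural ones (CPUID.01H:EDX bit 13 reports PGE, CR4 bit 7 enables it, bit 8 of a page table entry marks the page global); the function names are hypothetical, and the sketch assumes the caller has already confirmed that CPUID and CR4 exist on the CPU and passes the raw register values in, so the logic can be shown without privileged instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_01_EDX_PGE (1u << 13)  /* CPUID.01H:EDX bit 13: global pages supported */
#define CR4_PGE          (1u << 7)   /* CR4 bit 7: enable global pages */
#define PTE_PRESENT      (1u << 0)
#define PTE_WRITABLE     (1u << 1)
#define PTE_GLOBAL       (1u << 8)   /* must stay clear when PGE is unavailable */

static_assert(CR4_PGE == 0x80u && PTE_GLOBAL == 0x100u, "architectural bit positions");

/* `cpuid_01_edx` is the EDX value returned by CPUID leaf 1. */
static bool cpu_has_pge(uint32_t cpuid_01_edx)
{
    return (cpuid_01_edx & CPUID_01_EDX_PGE) != 0;
}

/* New CR4 value: sets PGE only when the CPU reports support. */
static uint32_t cr4_with_global_pages(uint32_t cr4, bool has_pge)
{
    return has_pge ? (cr4 | CR4_PGE) : cr4;
}

/* Flags for kernel-space mappings such as kernel stacks. */
static uint32_t kernel_pte_flags(bool has_pge)
{
    uint32_t flags = PTE_PRESENT | PTE_WRITABLE;
    if (has_pge)
        flags |= PTE_GLOBAL;
    return flags;
}
```

On older CPUs the same page table code then simply runs without the global bit, and kernel stack entries are flushed on every cr3 reload.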

Thanks,
Raghu

OSDev.org
