
Stack management

Posted: Sat Apr 25, 2015 8:26 pm
by darkinsanity
Hey there,
I'm currently refactoring some of my kernel-code, and I came to the conclusion that my stack management is probably not the best. I'm writing a microkernel, and my original approach was this: for each new thread started, the kernel allocated only a kernel-stack; userspace was responsible for getting its own stack. Then I began implementing IPC with popup-threads and figured that I needed a simple way to allocate a userspace-stack for them. So I started to use a bitmap for allocating stacks for the popup-threads.
However, I'm sure that my approach is far from being optimal. Not only does it probably lead to duplicated code, it also prevents me from using kernel-based stack-protection mechanisms. I then wondered how other hobby operating systems handle this, and started to poke around. Some just handed out page-sized chunks, starting from the top of the address space, and didn't care about freeing the memory (which would be especially bad for my kernel, since it uses popup-threads).
So, what would good stack-management look like? Even if I don't plan to implement stack-protection mechanisms at the moment, I want to have the option to do it later without reimplementing everything. Where would I place the stacks? And what if a thread needs more than a single page?

Another problem that bothers me is, when and how do I clean up threads? When a thread exits, the kernel uses the kernel-stack of the thread, so I can't immediately free it. But when do I do it, then? Should I let the scheduler check if there are threads that can be cleaned up, or do I add them to a list so the idle-thread does the cleanup (and what would happen if the CPU is kept busy)? Or should I switch to a one-stack-per-CPU-approach?

I'm interested in how other OSes handle these things, but the really small ones often don't handle this very well, and the big ones (Linux etc.) are too complicated to understand without investing hours. If you have solved these problems in your own OS (or in one you know), I'd be thankful for a link to the code.

Re: Stack management

Posted: Sat Apr 25, 2015 9:04 pm
by gerryg400
When my kernel allocates a user stack for a thread it calls the same function in my kernel that mmap eventually calls. In other words the allocated memory looks to the thread just like any memory that it had allocated itself. My mmap has an option MAP_GUARD to put a guard page at the bottom of the stack. The kernel uses this when it allocates a stack for the thread.

When a thread exits the kernel calls munmap on the region that it allocated for the stack.

I use the 'one kernel stack per core' method, so disposing of kernel stacks is not an issue I need to deal with.

Re: Stack management

Posted: Sat Apr 25, 2015 10:42 pm
by Brendan
Hi,
darkinsanity wrote:Another problem that bothers me is, when and how do I clean up threads? When a thread exits, the kernel uses the kernel-stack of the thread, so I can't immediately free it. But when do I do it, then? Should I let the scheduler check if there are threads that can be cleaned up, or do I add them to a list so the idle-thread does the cleanup (and what would happen if the CPU is kept busy)?
I normally have a FIFO queue of "thread data blocks" (which includes the thread's kernel stack) for threads that have been terminated but not completely cleaned up yet. This solved a few problems. For my messaging, messages are sent to a thread, and if the same thread ID is re-used too quickly then software might not know the old thread was terminated and accidentally send messages to the new thread. The FIFO queue prevents that by ensuring the same thread ID isn't re-used too quickly. The other thing it helped with is resource usage. When a thread is being spawned the kernel checks that FIFO queue of "thread data blocks", and if the oldest one is old enough it's recycled (which helps performance a little because it's already allocated and mapped into kernel space). Finally; when the kernel is running low on free physical pages it goes looking for pages it can free; which includes checking the FIFO queue of "thread data blocks" and freeing them if they're old enough.
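A minimal sketch of such a FIFO of dead "thread data blocks" might look like this. All names and the `REUSE_DELAY` constant are assumptions, and `malloc` stands in for whatever allocator the kernel actually uses:

```c
#include <stddef.h>
#include <stdlib.h>

/* One entry per terminated-but-not-yet-freed thread (hypothetical layout;
 * a real block would also hold the kernel stack, thread ID, etc.). */
struct thread_block {
    unsigned long        death_time;   /* tick count when the thread exited */
    struct thread_block *next;
};

/* FIFO of dead thread blocks: oldest at the head, newest at the tail. */
static struct thread_block *dead_head, *dead_tail;

#define REUSE_DELAY 1000   /* ticks a block must age before re-use (assumed) */

/* On thread exit: append the block to the FIFO instead of freeing it. */
void retire_thread(struct thread_block *tb, unsigned long now)
{
    tb->death_time = now;
    tb->next = NULL;
    if (dead_tail) dead_tail->next = tb; else dead_head = tb;
    dead_tail = tb;
}

/* On thread spawn: recycle the oldest dead block if it has aged enough
 * (its thread ID can be safely re-used), otherwise allocate a fresh one. */
struct thread_block *get_thread_block(unsigned long now)
{
    struct thread_block *tb = dead_head;
    if (tb && now - tb->death_time >= REUSE_DELAY) {
        dead_head = tb->next;
        if (!dead_head) dead_tail = NULL;
        return tb;                     /* recycled: already allocated/mapped */
    }
    return malloc(sizeof *tb);         /* stand-in for the kernel allocator */
}
```

The age check at the head is what prevents a thread ID from being re-used too quickly; the low-memory path Brendan mentions would simply walk the same FIFO from the head and free blocks that are old enough.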
darkinsanity wrote:Or should I switch to a one-stack-per-CPU-approach?
Switching from one kernel stack per thread to one kernel stack per CPU involves changing semantics that a lot of the micro-kernel's existing code may rely on. You'll have to weigh up the advantages and disadvantages and decide if it's something you want to do.

To help you make that decision, the advantages are:
  • Less memory usage
  • Slightly faster spawning and terminating threads
  • Better cache locality for kernel (e.g. less cache and TLB misses when kernel accesses its stack after a thread switch)
And the disadvantages:
  • No easy way to have "kernel threads" that do things like cleaning up kernel data and/or doing work before it's needed (pre-fetching, pre-calculating, whatever) when there's nothing more important to do.
  • When the kernel is in the middle of doing something less important; there's no way to stop and switch to something more important (e.g. creating a new lower priority process when a high priority IRQ occurs). This means that to avoid potential latency problems you might end up splitting up the more expensive operations into multiple small pieces so that you've got a chance to switch to more important things in-between those smaller pieces. For example; rather than creating a new process in one go; you might create some sort of "process data block", check if there's something more important to switch to, create a new/empty virtual address space for the process, check if there's something more important to switch to, then create a new "thread data block" for the process' initial thread.
Note: For both of the "disadvantages" above; I'd seriously consider splitting things up into "tasklettes" (little pieces of work the kernel needs to do - e.g. "handle system call from thread 0x1234", "create virtual address space for new process 0x4321", etc); and giving each tasklette a priority and having a (per CPU?) list of tasklettes for each priority. When anything happens (IRQ, kernel syscall, etc) you'd put a tasklette on the appropriate list. The kernel would do whichever tasklettes are higher priority than the current highest priority (user-space) thread before returning to the current highest priority (user-space) thread. It's like having a miniature "tasklette scheduler". Of course you'd want to make sure that creating and freeing tasklettes is very fast, and if it's faster to do something immediately instead of creating a tasklette you'd do it immediately.
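One way such a miniature tasklette scheduler could be sketched is below. The names, the priority count, and the LIFO posting order are all simplifying assumptions (a real kernel would keep FIFO order with a tail pointer, and one set of lists per CPU):

```c
#include <stddef.h>

#define NUM_PRIORITIES 8            /* 0 = lowest, 7 = highest (assumed) */

struct tasklette {
    void (*work)(void *arg);        /* the small piece of kernel work */
    void *arg;
    struct tasklette *next;
};

/* One list per priority; would be per CPU in an SMP kernel. */
static struct tasklette *queue[NUM_PRIORITIES];

/* Post a piece of deferred kernel work (LIFO here for brevity). */
void tasklette_post(struct tasklette *t, int prio)
{
    t->next = queue[prio];
    queue[prio] = t;
}

/* Before returning to user space: run every tasklette whose priority is
 * higher than the current user-space thread's, highest priority first. */
void tasklette_drain(int current_thread_prio)
{
    for (int p = NUM_PRIORITIES - 1; p > current_thread_prio; p--) {
        while (queue[p]) {
            struct tasklette *t = queue[p];
            queue[p] = t->next;
            t->work(t->arg);
        }
    }
}

/* Example work item for demonstration: increments the int it is given. */
static void incr(void *arg) { ++*(int *)arg; }
```

Lower-priority tasklettes simply stay queued until no runnable user-space thread outranks them, which is exactly the "do small pieces in-between" behaviour described above.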


Cheers,

Brendan

Re: Stack management

Posted: Mon Apr 27, 2015 4:32 am
by Velko
darkinsanity wrote:Another problem that bothers me is, when and how do I clean up threads? When a thread exits, the kernel uses the kernel-stack of the thread, so I can't immediately free it. But when do I do it, then?
My approach is to put the thread into a PENDING_CLEANUP state and switch away from it, as if the thread's time-slice was used up. But as far as the scheduling algorithm is concerned, PENDING_CLEANUP is no different from RUNNABLE.

Eventually the scheduler selects the thread again, but instead of switching to it, cleanup is performed.
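A sketch of that scheme, with hypothetical names. The key point is that by the time the scheduler pops the dead thread, it is running on some other thread's kernel stack, so freeing is safe (the cleanup hook here just counts invocations for demonstration; a real kernel would free the kernel stack and thread block):

```c
#include <stddef.h>

enum thread_state { RUNNABLE, BLOCKED, PENDING_CLEANUP };

struct thread {
    enum thread_state state;
    struct thread *next;              /* singly-linked run queue */
};

/* thread_exit: nothing is freed yet -- the exiting thread is still on its
 * own kernel stack.  Just flag it and let the scheduler switch away, as if
 * the time-slice expired. */
void thread_exit(struct thread *t)
{
    t->state = PENDING_CLEANUP;
    /* schedule();  -- hypothetical call that switches away */
}

/* Pick the next thread to run.  A PENDING_CLEANUP thread is kept on the
 * run queue like a RUNNABLE one, but when it comes up we clean it up
 * instead of switching to it. */
struct thread *pick_next(struct thread **rq_head,
                         void (*cleanup)(struct thread *))
{
    struct thread *t;
    while ((t = *rq_head) != NULL) {
        *rq_head = t->next;           /* pop from the run queue */
        if (t->state == PENDING_CLEANUP) {
            cleanup(t);               /* safe: we're on a different stack */
            continue;
        }
        return t;                     /* switch to this thread */
    }
    return NULL;                      /* run queue empty: go idle */
}

/* Demonstration cleanup hook: just counts how often it runs. */
static int cleanup_count;
static void count_cleanup(struct thread *t) { (void)t; ++cleanup_count; }
```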

Re: Stack management

Posted: Tue May 05, 2015 8:16 pm
by darkinsanity
gerryg400 wrote:When my kernel allocates a user stack for a thread it calls the same function in my kernel that mmap eventually calls. In other words the allocated memory looks to the thread just like any memory that it had allocated itself. My mmap has an option MAP_GUARD to put a guard page at the bottom of the stack. The kernel uses this when it allocates a stack for the thread.
That sounds reasonable. But how does your function decide where to place the stack? At the moment, my kernel has no infrastructure for managing userspace memory besides pagetables.
I'm currently thinking about using a doubly-linked list of structs describing memory areas (probably using page-granularity). When memory is allocated (via malloc or for a stack), I'd simply search the list for a free place and add it there. When allocating a stack, I'd search the list backwards, to keep the stacks away from heap allocations.
Implementation should be fairly easy, but I'm worried about efficiency, especially because I'm using popup-threads and thus need fast thread-spawning and stack allocation.
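The backward search over such an address-sorted area list might be sketched as below (names are hypothetical, and insertion of the new area into the list is omitted for brevity):

```c
#include <stddef.h>
#include <stdint.h>

/* One mapped region of the user address space (page-granular). */
struct area {
    uintptr_t start, end;            /* [start, end), page aligned */
    struct area *prev, *next;        /* doubly linked, sorted by address */
};

/* Walk the area list backwards from its last entry and return the start
 * address of the highest gap of at least `len` bytes below
 * `addr_space_top`, or 0 if no gap is large enough.  Searching from the
 * top keeps stacks away from the (upward-growing) heap allocations. */
uintptr_t find_stack_gap(struct area *last, uintptr_t addr_space_top,
                         size_t len)
{
    uintptr_t gap_end = addr_space_top;
    for (struct area *a = last; a; a = a->prev) {
        if (gap_end - a->end >= len)
            return gap_end - len;    /* highest gap that fits */
        gap_end = a->start;
    }
    return (gap_end >= len) ? gap_end - len : 0;
}
```

Each probe is O(number of areas); if popup-thread spawning ever makes that too slow, the list can later be supplemented with a balanced tree or a free-gap cache without changing the interface.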

Re: Stack management

Posted: Tue May 05, 2015 11:57 pm
by iansjack
Why not allocate a page (or pages) for your user stack and map it to the same address for each process? You can always grow the stack, via a page fault, if needs be.
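Growing the stack from the page-fault handler could be sketched like this. The addresses, the growth limit, and `map_page` are all hypothetical; the actual page mapping is left as a comment since it depends on the kernel's paging code:

```c
#include <stdint.h>

#define PAGE_SIZE      4096u
#define STACK_TOP      0xC0000000u   /* same virtual top in every process */
#define STACK_MAX_SIZE 0x00100000u   /* grow to at most 1 MiB (assumed) */

/* Called from the page-fault handler.  If the faulting address lies just
 * below the currently mapped stack and within the growth limit, map the
 * missing pages and resume; otherwise it's a genuine fault.
 * Returns 1 if the fault was handled, 0 if not. */
int maybe_grow_stack(uintptr_t fault_addr, uintptr_t *stack_bottom)
{
    if (fault_addr >= *stack_bottom)
        return 0;                            /* not below the stack */
    if (STACK_TOP - fault_addr > STACK_MAX_SIZE)
        return 0;                            /* beyond the growth limit */
    while (*stack_bottom > fault_addr) {
        *stack_bottom -= PAGE_SIZE;
        /* map_page(*stack_bottom);  -- hypothetical paging call */
    }
    return 1;
}
```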

Re: Stack management

Posted: Wed May 06, 2015 1:36 pm
by gerryg400
darkinsanity wrote:
gerryg400 wrote:When my kernel allocates a user stack for a thread it calls the same function in my kernel that mmap eventually calls. In other words the allocated memory looks to the thread just like any memory that it had allocated itself. My mmap has an option MAP_GUARD to put a guard page at the bottom of the stack. The kernel uses this when it allocates a stack for the thread.
That sounds reasonable. But how does your function decide where to place the stack? At the moment, my kernel has no infrastructure for managing userspace memory besides pagetables.
I'm currently thinking about using a doubly-linked list of structs describing memory areas (probably using page-granularity). When memory is allocated (via malloc or for a stack), I'd simply search the list for a free place and add it there. When allocating a stack, I'd search the list backwards, to keep the stacks away from heap allocations.
Implementation should be fairly easy, but I'm worried about efficiency, especially because I'm using popup-threads and thus need fast thread-spawning and stack allocation.
I use that method. I have a doubly-linked list of mapping structures. The mm just finds a space and inserts a new structure. I don't bother with searching backwards, since there will be multiple threads in the process, although it might help keep the heap contiguous. I'm also not too worried about speed at this point.

Later, if you find that speed is an issue, you could keep a small cache of stacks available in each process and only release stacks if they are not re-used quickly.
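Such a per-process stack cache could be as simple as the sketch below (names and cache size are assumptions). On spawn, a cached stack is re-used if available; on exit, the stack is kept for re-use unless the cache is full:

```c
#include <stddef.h>

#define STACK_CACHE_MAX 4          /* small per-process cache (assumed size) */

struct stack_cache {
    void *slot[STACK_CACHE_MAX];   /* tops of unused, still-mapped stacks */
    int   count;
};

/* On thread exit: keep the stack for re-use if there's room.
 * Returns 1 if cached, 0 if the caller should unmap it for real. */
int stack_cache_put(struct stack_cache *c, void *stack_top)
{
    if (c->count == STACK_CACHE_MAX)
        return 0;
    c->slot[c->count++] = stack_top;
    return 1;
}

/* On thread spawn: re-use a cached stack if one is available, else return
 * NULL and let the caller fall back to the normal allocation path. */
void *stack_cache_get(struct stack_cache *c)
{
    return c->count ? c->slot[--c->count] : NULL;
}
```

This keeps the fast path for popup-thread spawning down to a pointer pop, while the slow mmap-style path is only hit when the cache runs dry.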