OSDev.org

Posted: **Fri Feb 03, 2012 1:13 am**

Hi,

So far, my kernel allocates 10 pages to any new processes or threads. At the time i couldn't easily find a better solution and I figured it was enough. Coming back to this now, I'm wondering how I can handle the allocation for the stack more efficiently.
Basically how can I know when the stack is almost at the page edge and need a new page allocation... and what can i do if the next contiguous page isn't free ?

Any advice on that d be much appreciated.

Posted: **Fri Feb 03, 2012 1:37 am**

One of the way:

The CPU generate page fault when accessing non-present page. You then allocate new pages on the #PF handler.

Posted: **Fri Feb 03, 2012 1:42 am**

stanko51 wrote:Basically how can I know when the stack is almost at the page edge and need a new page allocation...

By checking %esp?

bluemoon's suggestion is better, though.

and what can i do if the next contiguous page isn't free ?

Why would the physical page need to be contiguous?

Posted: **Fri Feb 03, 2012 2:23 am**

Thank you! using PF... This was actually pretty obvious.
I was talking about pages in the virtual address space. But i guess depending on memory organization it is pretty unlikely to have some allocation just under the stack? (is this a correct assumption?)

Posted: **Fri Feb 03, 2012 2:31 am**

You should have well planned the layout of logical address space.
It's also very unlikely you let the stack grow unlimited, you should have some policy.
If the address collide you may terminate the app.

Posted: **Fri Feb 03, 2012 2:43 am**

The classic setup is code, data, heap, [empty], stack. If someone allocates enough (upward-growing) heap to interfere with (downward-growing) stack, tell the user the application sucks and terminate it.

Posted: **Mon Mar 05, 2012 3:37 am**

Thank you.
In the case the process is running several threads. Where do you place the different user stack ? leave empty spaces ?

Posted: **Mon Mar 05, 2012 4:08 am**

Solar wrote:The classic setup is code, data, heap, [empty], stack. If someone allocates enough (upward-growing) heap to interfere with (downward-growing) stack, tell the user the application sucks and terminate it.

Not necessarily. That's what break syscall good for (known as "sbrk"). If it happens, you can move the whole stack upwards, and modify esp in tss. You can implement it without specific syscall too, the trick is you have to check heap_end==stack_end in PF handler before a page is allocated. The app would not notice at all, and continues without a problem.

Posted: **Mon Mar 05, 2012 5:54 am**

turdus wrote:
Solar wrote:The classic setup is code, data, heap, [empty], stack.
That's what break syscall good for (known as "sbrk"). If it happens, you can move the whole stack upwards, and modify esp in tss.

Apparently I didn't make myself clear.

Code: Select all

The Classic Setup
=================

---------- 4 GB
Kernel
---------- 3 GB
v Stack v

^ Heap  ^
Data
Code
---------- 0 GB

The sbrk() call is used to extend the heap upwards. As you might note, the stack starts at the top of addressable memory (thanks to the MMU that makes this possible without wasting any actual memory). Hence, there is nowhere to move the stack to if heap and stack collide.

(Feel free to insert appropriate values for a 64bit system.)

Posted: **Mon Mar 05, 2012 7:24 am**

Perhaps he mean squeeze the stack by swapping some of them out? Not worth to implement but possible to extend the heap by a few KiB

Posted: **Mon Mar 05, 2012 8:04 pm**

that classic setup is pretty clear.

My follow up question is, what the classic setup for different threads sharing same address space would be ?

Code: Select all

---------- 4 GB
Kernel
---------- 3 GB
v Stack  thread 1 v
[empty]
v Stack  thread 2 v
[empty]
v Stack  thread 3 v

^ Heap  ^
Data
Code
---------- 0 GB

??

and if so, how big should empty space be ?

thanks

Posted: **Mon Mar 05, 2012 9:11 pm**

The implementation of threads is relatively new so there is no 'classic' way to do it.

You could use a single 4k guard page as empty space between thread stacks. You can make the stacks big (even MB) in virtual memory but only allocate physical memory as required right up/down to the guard page.

I suggest that you get the main() stack size from the executable file and allow the user an API to set other thread stack sizes. It's difficult for an OS to predict stack size on behalf of an application.

Posted: **Tue Mar 06, 2012 4:53 am**

Solar wrote:(Feel free to insert appropriate values for a 64bit system.)

I see what you mean, and you are right. But for 64bit it's very unlikely to face this situation, it's more probably will run out of physical RAM before heap exceeds the stack

Or, if you put the stack in upper half, you'll get an non-canonical address fault before it could interfere.

stanko51 wrote:that classic setup is pretty clear.

My follow up question is, what the classic setup for different threads sharing same address space would be ?

Threads should not share address space, imho. One good solution is:

Code: Select all

---------- 4 GB
Kernel                           mapped globally
---------- 3 GB
v Stack v                        mapped per thread
[empty]
^ Heap  ^                        mapped per process
Data                             mapped per process
Code                             mapper per process
---------- 0 GB

So threads of the same process will share code and data, but will have their individual stacks. This is what Solaris call light weight process. In this case the kernel aware of threads, and switches among them by invalidating only the stack (opposite to invalidating the full address space on process switch).

Using pthreads is a different kind, there kernel does not know about process' userspace threads, so there's pointless to talk about it's address space layout. They are implemented entirely in userspace, so you put your stacks in heap on a malloc'd area. If you do this, you should use a guard item (Minix uses 0xDEADBEAF for this), and check for it on every push:

Code: Select all

macro mypush a
{
      cmp dword [esp-4], 0xDEADBEEF
      jne @f
      //stack is full
      call expandstack
@@:   push a
}

Note that it's not easy to implement non-present guard page from userspace. You'll need mmap syscall that:
1. allows mapping of non-present page which could be dangerous
2.a. kernel would need to know how userspace malloc works (it must alter it's alloced/free structure) -OR-
2.b. userspace must take care of 3 types of memory: alloced, free, non-present (marking non-present as alloced could cause trouble)
I would go with guard item.

Posted: **Tue Mar 06, 2012 5:24 am**

Turdus wrote:Note that it's not easy to implement non-present guard page from userspace. You'll need mmap syscall that:

I just added a (non-standard) flag to mmap, MAP_GUARDPAGE. It causes the memory manager to not commit a physical page to the lowest page in the mmap'ed area.

Posted: **Tue Mar 06, 2012 9:32 am**

turdus wrote:If you do this, you should use a guard item (Minix uses 0xDEADBEAF for this), and check for it on every push:
Code: Select all
macro mypush a
{
      cmp dword [esp-4], 0xDEADBEEF
      jne @f
      //stack is full
      call expandstack
@@:   push a
}

Interesting. But would it breaks if I manually push 0xDEADBEEF on stack?

Code: Select all

mypush dword 0xDEADBEEF
call foo
add esp, 4
mypush eax ; will this trigger stack full alarm?

The 3rd solution is a structured exception handler, where unresolved #PF have a last chance to survive by passing to user handler for rescue.
you then mmap in the user handler.

OSDev.org

managing stack

managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack

Re: managing stack