Weird stack problem
Hi there. My first post at what seems to be a very knowledgeable forum!
So here's the deal: I have a very simple OS with a basic malloc() and round-robin scheduling using a linked list. It schedules threads represented by a Thread struct, and it all happens in kernel space (no user space yet).
It works pretty well, but after creating quite a few threads, I get a page fault. The thing is: whatever stack size I choose, the page fault ADDRESS is equal to the stack size + 11.
So if I allocate a 4K stack, I get page faults at 4096 + 11 = 4107. If I allocate an 8K stack, I get page faults at 8192 + 11 = 8203.
Note that I have identity mapped the entire memory, except 0-1 MB.
Any ideas on how to attack this problem?
Re:Weird stack problem
The +11 sounds like you're using an unaligned stack. The fact that you're above it indicates that you're underflowing the stack.
Do you have any code so we can offer direct help?
Ideas:
- Bochs debugger trace (of course).
- Map something there. If something is mapped there, figure out why it page faults anyway.
Re:Weird stack problem
Candy, thanks for the quick reply.
Actually, the stack is perfectly aligned. I verify that by outputting the stack base before running the thread.
I'm using real hardware only, so unfortunately I can't provide Bochs output.
I'm not sure how to extract the code, since I don't know where to look for the bug. I'll see what I can come up with.
But the overflowing theory might be interesting indeed.
Re:Weird stack problem
No, it's not underflow, I think.
I forgot a CRITICAL piece of information. Not only is the page fault address = stack size + 11. The EIP is too, and that's the real problem.
I get the page fault because the EIP is, say, 4107 (0x100B), which is waaay below my code.
So the weird part here is really: EIP = stack size + 11.
Re:Weird stack problem
cajunos, if eip is being affected, that sounds particularly like a stack over/underflow. This is the basis for buffer overflows in exploit development.
Think about it: a function is called (the call instruction places the return address, i.e. the next EIP, on the stack). Then in that function, some code overruns a buffer that was allocated on the stack and keeps overflowing until it hits the return address. Now when that function returns, well, it's gonna pop that now-invalid value off the stack into EIP!
oops.
proxy
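A minimal sketch of the mechanism described above, using a simulated stack frame (a plain array) instead of a real one, since overrunning a real stack buffer is undefined behavior in C. The names `BUF_WORDS`, `RET_SLOT`, `copy_into_buffer` and `simulate_call` are made up for illustration, and a real frame layout depends on the compiler and ABI.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_WORDS   4                 /* size of the local buffer, in words */
#define RET_SLOT    BUF_WORDS         /* return address sits right above it */
#define FRAME_WORDS (BUF_WORDS + 1)   /* buffer plus return address slot    */

/* Copies `count` words into the buffer at the bottom of the simulated
 * frame. It never checks `count` against BUF_WORDS, which is the bug.
 * The only limit is FRAME_WORDS, so the simulation itself stays in bounds. */
static void copy_into_buffer(uint32_t frame[FRAME_WORDS],
                             const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count && i < FRAME_WORDS; i++) {
        frame[i] = src[i];   /* i == RET_SLOT clobbers the return address */
    }
}

/* Simulates "call, run the buggy copy, return" and yields the value that
 * ret would load into EIP. */
uint32_t simulate_call(uint32_t return_address,
                       const uint32_t *src, size_t count)
{
    uint32_t frame[FRAME_WORDS] = {0};

    frame[RET_SLOT] = return_address;   /* pushed by the call instruction */
    copy_into_buffer(frame, src, count);
    return frame[RET_SLOT];             /* popped into EIP by ret */
}
```

Copying `BUF_WORDS` words leaves the return address intact. Copying one word more replaces it with whatever happened to be in the source data, and that is where EIP ends up.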
Re:Weird stack problem
Would you mind sharing the code that switches tasks and the code that sets them up? That would probably provide the most insight.
Re:Weird stack problem
Unfortunately, I cannot share the code.
Anyway... I think you are right that it is a stack under/overflow. Let me know if you have ideas on debugging approaches for this kind of problem. I am totally lost.
Re:Weird stack problem
Do you mean you can't share the code, or you don't want to share the code? It is a lot easier to figure out what could be wrong if we have code available. On the other hand, if you're developing this commercially or for a company under contract (or any other situation where you actually are not allowed to share the code), you may have rules that mean asking us for advice is putting yourself on wobbly ground (especially if you told them you could handle this).
Re:Weird stack problem
Without the slightest hint of code, this is all I can say:
Create a debug command that simply dumps the upper portion of the stack (the latest pushes), and insert calls to it at many locations.
Wait for a key press after each dump, and have very verbose debug output.
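A rough sketch of such a dump routine, assuming 32-bit stack words. The name `dump_stack_words` is hypothetical, and it formats into a buffer with the hosted `snprintf`; in a kernel you would swap that for your own print routine. How you obtain the current stack pointer (for example, a small assembly helper) is left out here.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats `count` 32-bit words starting at `sp` (the most recent pushes
 * come first, since the x86 stack grows downward) into `out`, one
 * "address: value" line per word. Returns the number of characters
 * written, not counting the terminating NUL. */
int dump_stack_words(const uint32_t *sp, int count,
                     char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (int i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used, "%p: %08lx\n",
                         (const void *)(sp + i), (unsigned long)sp[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop a partially written line */
            break;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling this just before and just after the thread switch, with the stack pointer of the thread about to run, should show whether a value like 0x100B is already sitting where a return address is expected.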
Re:Weird stack problem
Dear cajunos,
Nobody here wants to steal your code. I can assure you that the people who have replied to your question (including myself) have already implemented task switching.
While it is perfectly valid that you don't want to post your code, without it the amount of help people can give you is limited, especially when you describe your status as "totally lost" and cannot properly point to where the problem might be.
As proxy said, if EIP is messed up, it's a direct result of a bad return address from a function call, which indicates a stack underflow.
Try to isolate the problem as much as you can, and print the return address at the start and end of every function. If, just before the page fault, you see that the return address was modified, you are one step closer to the solution.
Hint: The task switching code (that actually does the stack switch) should be compiled with the option to omit the frame pointer reference.
Daniel.
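One possible way to do the start/end check, sketched with the GCC/Clang builtin `__builtin_return_address(0)`. The macro names `TRACE_ENTER` and `TRACE_LEAVE` and the helper `log_return_address` are invented for this example, and `printf` stands in for whatever console output the kernel has. Note that an optimizing compiler may cache the return address rather than re-read it, so this is best tried in an unoptimized build.

```c
#include <stdio.h>

/* Hypothetical helper: logs a function's return address together with a
 * label such as "enter" or "leave". */
static void log_return_address(const char *func, const char *label,
                               const void *ret)
{
    printf("%s %s: return address %p\n", func, label, ret);
}

/* Put TRACE_ENTER() first in a function and TRACE_LEAVE() before each
 * return. TRACE_LEAVE() evaluates to 1 if the return address is the same
 * as on entry, 0 if something overwrote it in between. */
#define TRACE_ENTER()                                                    \
    void *volatile trace_ret_ = __builtin_return_address(0);            \
    log_return_address(__func__, "enter", trace_ret_)

#define TRACE_LEAVE()                                                    \
    (log_return_address(__func__, "leave", __builtin_return_address(0)), \
     trace_ret_ == __builtin_return_address(0))

/* Example of an instrumented function. */
int example_work(int x)
{
    TRACE_ENTER();
    int result = x * 2;
    int intact = TRACE_LEAVE();

    return intact ? result : -1;
}
```

If the "leave" line shows a different address than the "enter" line for some function, the code that ran in between is where to look.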
Re:Weird stack problem
Thank you for the hint.
The actual thread switching code is assembler, i.e. assembled with NASM. That should be fine, right?
That function is called from C, of course, but that shouldn't matter?
Re:Weird stack problem
Actually it DOES matter, since the calling C function saves the frame and stack pointer when it enters, before it calls your assembly function.
When the function leaves, it will restore the frame pointer and the stack pointer to what they were BEFORE you called the stack switching code (to what it saved when the function entered, from another thread's stack).
Compile your scheduler with the compiler option to omit the frame pointer.
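To make the effect concrete, here is a small simulation of it in C. It rests on an assumption: the switch routine swaps only ESP and leaves EBP alone, while the calling C function uses the usual `push ebp / mov ebp, esp ... leave / ret` frame. The `Cpu` struct, `run_schedule`, `demo` and the addresses `RET_A`/`RET_B` are all invented for the illustration, and it does not reproduce the exact "stack size + 11" value from this thread. With GCC, the compiler option in question is `-fomit-frame-pointer`.

```c
#include <stdint.h>

#define MEM_WORDS 64

enum { RET_A = 0xA000, RET_B = 0xB000 };  /* made-up return addresses */

typedef struct {
    uint32_t mem[MEM_WORDS];  /* simulated memory, indexed in words */
    uint32_t esp;             /* stack pointer (word index)         */
    uint32_t ebp;             /* frame pointer (word index)         */
} Cpu;

static void push(Cpu *c, uint32_t v) { c->mem[--c->esp] = v; }
static uint32_t pop(Cpu *c) { return c->mem[c->esp++]; }

/* Simulates the C scheduler function calling a switch routine that swaps
 * only ESP. Returns the address that the final ret pops into EIP. */
uint32_t run_schedule(Cpu *c, uint32_t next_esp, int use_frame_pointer)
{
    if (use_frame_pointer) {
        push(c, c->ebp);       /* prologue: push ebp     */
        c->ebp = c->esp;       /*           mov ebp, esp */
    }

    c->esp = next_esp;         /* switch routine: load next thread's ESP */

    if (use_frame_pointer) {
        c->esp = c->ebp;       /* leave: mov esp, ebp (old thread's frame!) */
        c->ebp = pop(c);       /*        pop ebp                            */
    }
    return pop(c);             /* ret */
}

/* Thread A (stack top at word 32) calls the scheduler to switch to
 * thread B (stack top at word 64), which was suspended earlier. */
uint32_t demo(int use_frame_pointer)
{
    Cpu c = {0};
    uint32_t b_esp = MEM_WORDS;

    c.mem[--b_esp] = RET_B;            /* where B should resume */
    if (use_frame_pointer) {
        c.mem[--b_esp] = 0;            /* B's saved EBP */
    }

    c.esp = 32;
    c.ebp = 32;
    push(&c, RET_A);                   /* call schedule, made by thread A */
    return run_schedule(&c, b_esp, use_frame_pointer);
}
```

With the frame pointer in use, `demo(1)` returns `RET_A`: the epilogue pulls ESP back onto thread A's stack, so the switch is silently undone (or, with a freshly built stack, execution jumps to garbage). Without it, `demo(0)` returns `RET_B` as intended. Saving and restoring EBP inside the switch routine would be another way to get the same result.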
Re:Weird stack problem
Well, whattayouknow! It solved the problem. I would never-ever have thought of this - at least not on this side of x-mas.
(cajunos kisses the forum)