How exactly does the stack work?

Candy
Member
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

How exactly does the stack work?

Post by Candy »

I'm wondering about an off-by-one error I'm almost certainly committing. Suppose:

real mode, stack segment = 0x0000, stack pointer = 0x8000

If I pushed a 32-bit value, the stack pointer would go to 0x7FFC. Would the byte at exactly 0x8000 be overwritten or not?

If not, how would I go about making a 65536-byte stack? Using 0x0 for the stack pointer seems awkward.

TIA for any info, Candy
Therx

Re:How exactly does the stack work?

Post by Therx »

No. When the processor uses the stack, the byte at the initial SP/ESP value remains intact.

When the CPU pushes to the stack:

Code: Select all

function push(value)
{
   SP = SP - sizeof(value);   // decrement first...
   [SS:SP] = value;           // ...then store at the new top of stack
}
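A minimal C sketch of that push sequence, assuming a toy 64 KiB array stands in for the stack segment. The names `stack_seg`, `sp` and `push32` are made up for illustration. It shows that with SP = 0x8000 a 32-bit push writes 0x7FFC through 0x7FFF and leaves the byte at 0x8000 alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a real-mode stack segment (SS = 0x0000): 64 KiB of memory. */
static uint8_t  stack_seg[0x10000];
static uint16_t sp = 0x8000;

/* Push a 32-bit value: decrement SP first, then store at the new SP. */
static void push32(uint32_t value)
{
    sp = (uint16_t)(sp - sizeof value);          /* 0x8000 -> 0x7FFC */
    /* This toy model does not handle a store that straddles the wrap. */
    assert(sp <= 0x10000 - sizeof value);
    memcpy(&stack_seg[sp], &value, sizeof value);
}
```

After `push32(x)` with `sp` starting at 0x8000, `sp` is 0x7FFC and `stack_seg[0x8000]` still holds whatever it held before.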
Candy wrote: If not, how would I go about making a 65536-byte stack? Using 0x0 for the stack pointer seems awkward
Don't know, but really I'd use 32-bit mode. You're going to run out of memory very quickly, as 1 MB is quite small anyway and the memory below 1 MB is full of holes.

Pete
Tim

Re:How exactly does the stack work?

Post by Tim »

Using an initial value of 0 for SP in real mode is fine. At first glance it looks wrong, but remember that SP wraps around to the top of the segment the first time you push anything.
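The wrap-around is just 16-bit arithmetic. A small C sketch, where `sp_after_push` and `stack_capacity` are hypothetical helper names and the stack is assumed to be allowed to run all the way down to offset 0 of the segment:

```c
#include <assert.h>
#include <stdint.h>

/* New SP after pushing `size` bytes in real mode. SP is 16 bits wide,
 * so the subtraction wraps modulo 0x10000. */
static uint16_t sp_after_push(uint16_t sp, uint16_t size)
{
    assert(size == 2 || size == 4);   /* word or dword push */
    return (uint16_t)(sp - size);
}

/* Bytes that can be pushed before SP comes back down to offset 0. */
static uint32_t stack_capacity(uint16_t initial_sp)
{
    return initial_sp == 0 ? 0x10000u : initial_sp;
}
```

So `sp_after_push(0, 2)` is 0xFFFE, and an initial SP of 0 gives the full 65536-byte stack asked about in the first post, while an initial SP of 0x8000 gives 32768 bytes.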
Candy

Re:How exactly does the stack work?

Post by Candy »

Wouldn't you get a stack overflow (kinda forgot how that worked...) by setting SP to 0?

Stack overflow detection in PM works by checking for a page fault in the stack region, right? Any specific info I missed?

Oh, and another one: if I'm in protected mode, have just loaded a task and allocated some pages, can I set its ESP to 0 so it'll start from 0xFFFFFFFC with the first dword?

Pete: I am using 32-bit code in the kernel, was just wondering about the OS loader module, and how to cram it all in the first 64k (which I've now succeeded in :)).
Tim

Re:How exactly does the stack work?

Post by Tim »

I thought we were talking real mode? I'm not sure how this would work in protected mode, or whether you'd get a stack fault as ESP wrapped below zero.
Candy

Re:How exactly does the stack work?

Post by Candy »

New idea about protected mode:

I'm thinking about using a single TSS as scratch space for the CPU, with CR3 loading and the like all done in software, but I don't know if I got this part right:

Using one TSS for scratch space (OS-wide...)

When a task is running and an interrupt occurs that sends it to the scheduler in cpl0, the processor stores all registers (eax - edi) on the cpl3 stack, switches to the cpl0 stack (from the TSS) and stores the return info (cs, eip, ss, esp) for cpl3 there. Then it enters the scheduler, which swaps the stack for the kernel stack. The kernel goes to the task-selection module and switches CR3 to that of the new task (so the current task info changes to the new task's info, while the globally mapped OS code stays the same). The kernel then switches to the stack of a different process. Upon doing an iret with a cpl change, the cpl0 stack is stored in the TSS, the cpl3 info is loaded from the cpl0 stack, and the registers are loaded from the cpl3 stack, after which execution continues.

Starting a process: preload the cpl3 stack of the process with zeroes to serve as the initial register values, preload the cpl0 stack with the return values that an interrupt would have put there, assign a cr3, etc...
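For reference, a hedged sketch of what "the return values that an interrupt would have put there" look like on 32-bit x86. On an interrupt that arrives at cpl3, the CPU pushes SS, ESP, EFLAGS, CS and EIP on the cpl0 stack (plus an error code for some exceptions), and it does not push eax - edi. The struct and helper names below are hypothetical, and the selector values a caller passes depend entirely on its own GDT.

```c
#include <assert.h>
#include <stdint.h>

/* Frame the CPU leaves on the cpl0 stack after a cpl3 -> cpl0 interrupt,
 * lowest address first. General-purpose registers are NOT part of it. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t user_esp;
    uint32_t user_ss;
};

/* Hypothetical helper: build a fake frame at the top of a new task's
 * cpl0 stack so that a later iret "returns" into the task's entry point. */
static struct iret_frame *prepare_first_frame(void *kstack_top,
                                              uint32_t entry,
                                              uint32_t user_stack_top,
                                              uint16_t user_cs,
                                              uint16_t user_ss)
{
    assert(((uintptr_t)kstack_top % 4) == 0);   /* keep the frame dword-aligned */
    struct iret_frame *f = (struct iret_frame *)kstack_top - 1;
    f->eip      = entry;
    f->cs       = user_cs;
    f->eflags   = 0x202;          /* IF set; bit 1 always reads as 1 */
    f->user_esp = user_stack_top;
    f->user_ss  = user_ss;
    return f;
}
```

The returned pointer is the value the task's saved cpl0 ESP would have to hold just before the `iret`, after any software-saved registers have been popped.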

Did I make a mistake anywhere? I feel like I did something wrong with this logic... it seems too simple...
Pype.Clicker
Member
Member
Posts: 5964
Joined: Wed Oct 18, 2006 2:31 am
Location: In a galaxy, far, far away
Contact:

Re:How exactly does the stack work?

Post by Pype.Clicker »

Candy wrote: New idea about protected mode:

I'm thinking about using a single TSS for scratch space for the cpu. CR3 loading and stuff all software-methodics, but I don't know if I got this part correctly:

Using one TSS for scratch space (OS-wide...)

When a task is running and an interrupt occurs that sends it to the scheduler in cpl0, the processor stores all registers (eax - edi) on the cpl3 stack, switches to the cpl0 stack (from the TSS) and stores the return info (cs, eip, ss, esp) for cpl3 there.
Hmm. I'm unsure how you plan to save eax-edi on the user stack, but this is in no way something you can do easily with an interrupt handler -- except if you check in software the SS:ESP value that the CPU saved on your DPL0 stack and manually store the register values there ... however, that won't work unless you also divert the instruction stream to a user-level "popa" instruction ...
Candy wrote: Then it enters the scheduler, which swaps the stack for the kernel stack.
Uuh? Well, you were already on a CPL0 stack, so what's the plan for an additional "kernel stack" here? Can't the CPL0 stack of each process be the "kernel stack"?
Candy wrote: The kernel goes to the task-selection module and switches CR3 to that of the new task (so the current task info changes to the new task's info, while the globally mapped OS code stays the same). The kernel then switches to the stack of a different process.
Or maybe you expect the kernel stack to be available in every process while the process-specific CPL0 stacks are not available globally ... hmm ... but why not just assume that you will not access the stack between the stack-switch code (between P1.dpl0_stack and P2.dpl0_stack) and CR3 reprogramming?
Candy wrote: Upon doing an iret with a cpl change, the cpl0 stack is stored in the TSS, the cpl3 info is loaded from the cpl0 stack, and the registers are loaded from the cpl3 stack, after which execution continues.
Watch out: there is no automatic restore of TSS.ESP0 or TSS.SS0 by the CPU. These fields are *static*, which means the CPU reads them at interrupt handling, but it never writes to them automatically.
distantvoices
Member
Member
Posts: 1600
Joined: Wed Oct 18, 2006 11:59 am
Location: Vienna/Austria
Contact:

Re:How exactly does the stack work?

Post by distantvoices »

Do you mean software task switching by means of switching stacks?

Oh sweet jesus, this stack-switching dark magic is something you have to do in the ISR prologue/epilogue. There are some tutorials at www.osdever.net which should explain this with code examples.

And you *have* to update esp0/ss0 in the global TSS for each task, so that the processor knows which esp0 to use in case of an interrupt.

Also, this *switching-to-kernel-stack* has to happen in the ISR prologue, right after saving all registers to the stack0 of the interrupted process and saving the esp0 value in some field for this process (a process structure).
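A rough C sketch of the "update esp0 in the global TSS" step, assuming a single TSS shared by all tasks. Only the first three fields of the 32-bit TSS are named here, and `struct task`, `global_tss` and `on_task_switch` are hypothetical names, not taken from any of the kernels discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS, 104 bytes. The CPU reads ss0:esp0 on a cpl3 -> cpl0
 * transition and never writes them back. */
struct tss32 {
    uint32_t prev_task_link;
    uint32_t esp0;        /* top of the cpl0 stack to switch to */
    uint32_t ss0;         /* cpl0 stack selector in the low 16 bits */
    uint32_t rest[23];    /* remaining fields, unused in this sketch */
};

static struct tss32 global_tss;

/* Hypothetical per-task record kept by the scheduler. */
struct task {
    uint32_t kstack_top;  /* top of this task's own cpl0 stack */
};

/* Called by the scheduler whenever it picks a new task: the next
 * interrupt from cpl3 must land on that task's cpl0 stack. */
static void on_task_switch(const struct task *next)
{
    assert(next->kstack_top != 0);
    global_tss.esp0 = next->kstack_top;
}
```

If every task shares one cpl0 stack instead, `esp0` can be set once and left alone, which is the case Candy describes later in the thread.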

Do you have some good books about OS dev at hand?

stay safe...
Pype.Clicker

Re:How exactly does the stack work?

Post by Pype.Clicker »

Candy wrote: Q: Wouldn't you get a stack overflow (kinda forgot how that worked...) by setting SP to 0?

Stack overflowing in PM works by checking for a page fault in the stack region right? any specific info I missed?
Setting ESP to 0 is usually bad practice in protected mode. It is very unlikely that your stack segment is 4GB wide *and* that address 0xFFFFFFFC is the valid top of your stack.

Stack overflow can be detected by a "guard page", but you can also use segmentation protection: use an expand-down stack segment whose limit is computed so that the "valid stack addresses" start just above the limit (and go up to 0xFFFFFFFF) ...

The advantage of this technique over pure paging protection is that you can still handle a stack overflow on the kernel stack if you dedicate a TSS to handling the Stack Fault exception.
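To make the expand-down arithmetic concrete, here is a small C sketch. It works on the effective byte limit (after any granularity scaling) of a 32-bit expand-down segment whose upper bound is 0xFFFFFFFF, and it ignores how that limit is encoded in the descriptor. `expand_down_limit` and `offset_is_valid` are hypothetical names, and the check looks at a single byte offset rather than a whole multi-byte access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Effective limit that leaves exactly `stack_size` valid bytes at the
 * top of the offset range: valid offsets are limit+1 .. 0xFFFFFFFF. */
static uint32_t expand_down_limit(uint32_t stack_size)
{
    assert(stack_size != 0);
    return 0xFFFFFFFFu - stack_size;
}

/* In an expand-down segment an offset is valid only if it lies
 * strictly above the limit. */
static bool offset_is_valid(uint32_t limit, uint32_t offset)
{
    return offset > limit;
}
```

For a 16 KiB stack the limit comes out as 0xFFFFBFFF, so 0xFFFFC000 is the lowest usable offset and a push that takes ESP below it would raise a stack fault instead of silently overwriting other memory.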
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany
Contact:

Re:How exactly does the stack work?

Post by Solar »

Candy wrote: If I pushed a 32-bit value, the stack pointer would go to 0x7FFC. Would the byte at 0x8000 exact be overwritten or not?
This thread has already ventured beyond the initial question, but I usually try to explain "why" along with "what" because it makes understanding so much easier:

If your stack starts at 0x8000, and you push a 32-bit value, you'd expect that value to be longword-aligned, wouldn't you?

That's "why" 0x8000 isn't used...
Candy

Re:How exactly does the stack work?

Post by Candy »

Pype.Clicker wrote: Hmm. I'm unsure how you plan to save eax-edi on the user stack, but this is in no way something you can do easily with an interrupt handler -- except if you check in software the SS:ESP value that the CPU saved on your DPL0 stack and manually store the register values there ... however, that won't work unless you also divert the instruction stream to a user-level "popa" instruction ...
I was planning on using the cpl0 stack of the process for the eax-edi (or rax - r15) values of the process, by starting with a pushad-like function (something different for amd64, since they ripped out pushad...). Either the system pushes them automatically on interrupt (as in real mode, which is what I expected) or it does not, in which case the ones from the current stack are used.

Returning to the function would be like popa; iret.
Pype.Clicker wrote: Uuh? Well, you were already on a CPL0 stack, so what's the plan for an additional "kernel stack" here? Can't the CPL0 stack of each process be the "kernel stack"?
The idea behind having a different cpl0 stack for each process is that each process functions completely independently from the other processes and, most importantly, that the kernel does not suffer from whatever the user process might or might not do.
Pype.Clicker wrote: Or maybe you expect the kernel stack to be available in every process while the process-specific CPL0 stacks are not available globally ... hmm ... but why not just assume that you will not access the stack between the stack-switch code (between P1.dpl0_stack and P2.dpl0_stack) and CR3 reprogramming?
Because that would go straight against my design philosophy of keeping mechanism separate from policy. Switching the task itself is policy, and therefore a separate function should be called from that space (thereby necessitating a stack) for the CR3 switch. Also, for code structure, I would like to use the kernel stack for all kernel interrupts, this one being one of the primary ones. This guarantees for me that the stack is safe from anything related to the user process. Might just be superstition, but I think it's necessary :D

Pype.Clicker wrote: Watch out: there is no automatic restore of TSS.ESP0 or TSS.SS0 by the CPU. These fields are *static*, which means the CPU reads them at interrupt handling, but it never writes to them automatically.
Okay... didn't notice that... still, I don't think they (need to) change, actually... if all's well, the stack will only be used for the kernel functions called by the current execution context, and every time it returns the stack will be identical...

part 2 follows
Candy

Re:How exactly does the stack work?

Post by Candy »

distantvoices wrote: oh sweet jesus, this stack switching dark magic is something you have to do in the ISR prologue/epilogue. There are some tutorials at www.osdever.net which should explain this with code examples.

Code: Select all

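void sched__switch_wrap() {
   // switch to scheduler stack: save the current task's ESP, load the kernel's
   asm volatile ("movl %%esp, %0; movl %1, %%esp": "=c"(ct->esp) : "a"(kernel_t.esp));
   ct->status = SCHED_SUSP;
...
   // switch stack back: save the kernel's ESP, load the current task's
   asm volatile ("movl %%esp, %0; movl %1, %%esp": "=c"(kernel_t.esp) : "a"(ct->esp));

   sti();

   if (ct->status == SCHED_SUSP) {
      ct->status = SCHED_RUNNING;
      // drop 8 bytes from the stack, restore EBP, then return via iret
      asm volatile ("addl $8, %esp; popl %ebp; iretl");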
works for me :)
distantvoices wrote: And you *have* to update esp0/ss0 in the global TSS for each task, so that the processor knows which esp0 to use in case of an interrupt.
Okay, didn't know that. I'll remember it when I start writing the code that starts cpl3 tasks...
distantvoices wrote: Also, this *switching-to-kernel-stack* has to happen in the ISR prologue, right after saving all registers to the stack0 of the interrupted process and saving the esp0 value in some field for this process (a process structure).
k... didn't save the registers yet... *slaps head* thx
distantvoices wrote: Do you have some good books about OS dev at hand?
The Indispensable PC Hardware Book (3rd and 4th editions), Modern Operating Systems (Andy), Operating systems backgrounds mechanism & design (Stallings) and the series of processor books from AMD (paperback! :))

Pype.Clicker wrote: Setting ESP to 0 is usually bad practice in protected mode. It is very unlikely that your stack segment is 4GB wide *and* that address 0xFFFFFFFC is the valid top of your stack.
Stack overflow can be detected by a "guard page", but you can also use segmentation protection: use an expand-down stack segment whose limit is computed so that the "valid stack addresses" start just above the limit (and go up to 0xFFFFFFFF) ...
I was planning on using paging and no segmentation, because of the second platform I want to support (AMD64), which does not support segmentation (and possibly the third, IA64, which has no support either). Stack underflow will be detected by an unmapped page, stack overflow by dynamic calculation of the stack size and by not mapping in another page :) (paging code)
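One possible shape for that paging-only scheme, sketched in C. This is an illustration under assumptions rather than anyone's actual kernel code: the `stack_region` fields, the enum names and the one-page windows around the stack are all hypothetical, and a real page-fault handler would also have to look at the error code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

enum stack_fault { FAULT_UNRELATED, FAULT_GROW, FAULT_OVERFLOW, FAULT_UNDERFLOW };

struct stack_region {
    uint32_t top;         /* one past the highest stack byte, page-aligned */
    uint32_t mapped_low;  /* lowest stack address that is currently mapped */
    uint32_t max_size;    /* policy limit on stack size, in bytes */
};

/* Decide what a page fault at `addr` means for this stack. */
static enum stack_fault classify_fault(const struct stack_region *s, uint32_t addr)
{
    assert(s->max_size <= s->top && s->mapped_low <= s->top);
    uint32_t lowest_allowed = s->top - s->max_size;

    if (addr >= s->top)            /* unmapped page just above the stack */
        return (addr - s->top < PAGE_SIZE) ? FAULT_UNDERFLOW : FAULT_UNRELATED;
    if (addr >= s->mapped_low)     /* inside the mapped part: not a growth fault */
        return FAULT_UNRELATED;
    if (addr >= lowest_allowed)    /* below mapped part, within policy: map a page */
        return FAULT_GROW;
    /* within one page below the limit: refuse to map, report overflow */
    return (lowest_allowed - addr <= PAGE_SIZE) ? FAULT_OVERFLOW : FAULT_UNRELATED;
}
```

A handler built on this would map a fresh page on `FAULT_GROW` and kill or signal the task on `FAULT_OVERFLOW` or `FAULT_UNDERFLOW`. It does not address the kernel-stack case raised above, where the fault handler itself needs a working stack.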
Pype.Clicker wrote: The advantage of this technique over pure paging protection is that you can still handle a stack overflow on the kernel stack if you dedicate a TSS to handling the Stack Fault exception.
Would be nice, but not using them allows easy porting to the other platforms. Again, I don't want to use a TSS for anything (anymore...), but using a separate stack for the stack fault exception would be an option. Having the same effect and size as the TSS option, it just doesn't matter :)