How exactly does the stack work?
I'm wondering about an off-by-one error I'm almost certainly committing. Suppose:
real mode, stack segment = 0x0000, stack pointer = 0x8000
If I pushed a 32-bit value, the stack pointer would go to 0x7FFC. Would the byte at exactly 0x8000 be overwritten or not?
If not, how would I go about making a 65536-byte stack? Using 0x0 for the stack pointer seems awkward.
TIA for any info, Candy
Re:How exactly does the stack work?
No. When the processor uses the stack, the byte at the SP/ESP value remains intact.
When the CPU pushes to the stack:
Code: Select all
function push(value)
{
    SP = SP - sizeof(value);   // decrement first...
    *SP = value;               // ...then store at the new SP
}
Candy wrote: If not, how would I go about making a 65536-byte stack? Using 0x0 for the stack pointer seems awkward
Don't know, but really I'd use 32-bit mode: you're going to run out of memory very quickly, as 1 MB is quite small anyway and the memory below 1 MB is full of holes.
Pete
Re:How exactly does the stack work?
Using an initial value of 0 for SP in real mode is fine. At first glance it looks wrong, but remember that SP wraps around to the top of the segment the first time you push anything.
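To make both answers concrete, here is a small C model of a real-mode push. It is only a sketch: `mem`, `stack_regs`, `linear` and `push32` are names made up for the illustration, and the model assumes SP is treated as a plain 16-bit counter that wraps modulo 65536.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of real-mode memory: 1 MiB plus 64 KiB of slack,
   so that no SS:offset pair can index outside the array. */
static uint8_t mem[0x110000];

typedef struct {
    uint16_t ss;  /* stack segment */
    uint16_t sp;  /* stack pointer, wraps modulo 65536 */
} stack_regs;

/* Linear address of SS:offset in real mode. */
static uint32_t linear(uint16_t ss, uint16_t off)
{
    return ((uint32_t)ss << 4) + off;
}

/* Push a 32-bit value: decrement SP first, then store at the new SS:SP. */
static void push32(stack_regs *r, uint32_t value)
{
    assert(r);
    r->sp = (uint16_t)(r->sp - 4);            /* 0x0000 - 4 wraps to 0xFFFC */
    for (int i = 0; i < 4; i++) {             /* little-endian store */
        uint16_t off = (uint16_t)(r->sp + i);
        mem[linear(r->ss, off)] = (uint8_t)(value >> (8 * i));
    }
}
```

With SS = 0 and SP = 0x8000, `push32` touches only 0x7FFC..0x7FFF, so the byte at 0x8000 is never written. Starting from SP = 0 the first push lands at 0xFFFC..0xFFFF, which is how a full 64 KiB stack falls out of an initial SP of 0.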
Re:How exactly does the stack work?
Wouldn't you get a stack overflow (kinda forgot how that worked...) by setting SP to 0?
Stack overflow detection in PM works by checking for a page fault in the stack region, right? Any specific info I missed?
Oh, and another one: if I'm in protected mode, have just loaded a task and allocated some pages, can I set its ESP to 0 so that the first push lands at 0xFFFFFFFC?
Pete: I am using 32-bit code in the kernel; I was just wondering about the OS loader module and how to cram it all into the first 64k (which I've now succeeded in).
Re:How exactly does the stack work?
I thought we were talking real mode? I'm not sure how this would work in protected mode; whether you'd get a stack fault as ESP went below zero.
Re:How exactly does the stack work?
New idea about protected mode:
I'm thinking about using a single TSS as scratch space for the CPU, with CR3 loading and the rest all done in software, but I don't know if I got this part right:
Using one TSS for scratch space (OS-wide...)
When a task is running and an interrupt occurs that sends it to the scheduler in cpl0, the processor stores all registers (eax - edi) on the cpl3 stack, switches to the cpl0 stack (from the TSS) and stores the return info (cs, eip, ss, esp) for cpl3 there. Then it enters the scheduler, which swaps the stack for the kernel stack. The kernel goes to the task-choosing module and switches CR3 to that of the new task (so the current task info changes to the new task's info, while the globally mapped OS code stays the same). The kernel then switches to the stack of a different process. Upon doing an iret with a cpl change, the cpl0 stack is stored in the TSS, the cpl3 info is loaded from the cpl0 stack, and the registers are loaded from the cpl3 stack, after which execution continues.
Starting a process: preloading the cpl3 stack of the process with zeroes to serve as the initial register values, preloading the cpl0 stack with the return values that an interrupt would have put there, assigning a CR3, etc...
Did I make a mistake anywhere? I feel like I did something wrong with this logic... it seems too simple...
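As a point of reference for the "starting a process" step, here is a sketch in C of the frame that sits on the cpl0 stack after an interrupt from cpl3 in 32-bit protected mode. The CPU itself pushes SS, ESP, EFLAGS, CS and EIP (EFLAGS is in there too) and does not save the general-purpose registers; those only end up on the stack if the handler does a pushad. The type and function names, and the idea of building the first frame by hand, are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saved by software with pushad; lowest address first. */
typedef struct {
    uint32_t edi, esi, ebp, esp_dummy, ebx, edx, ecx, eax;
} pushad_regs;

/* Pushed by the CPU on an interrupt from CPL3 to CPL0 (no error code);
   lowest address first. */
typedef struct {
    uint32_t eip, cs, eflags, user_esp, user_ss;
} cpu_iret_frame;

typedef struct {
    pushad_regs    regs;
    cpu_iret_frame iret;
} task_frame;

_Static_assert(offsetof(task_frame, iret) == 32, "pushad saves 8 dwords");

/* Hypothetical helper: build the first frame of a new task at the top of
   its CPL0 stack, so that "popa; iret" starts it in user mode.
   kstack_top must be 4-byte aligned. */
static task_frame *make_initial_frame(void *kstack_top, uint32_t entry,
                                      uint32_t user_stack_top,
                                      uint16_t user_cs, uint16_t user_ss)
{
    task_frame *f = (task_frame *)kstack_top - 1;
    *f = (task_frame){0};            /* all general registers start at 0 */
    f->iret.eip      = entry;
    f->iret.cs       = user_cs;
    f->iret.eflags   = 0x202;        /* reserved bit 1 set, IF = 1 */
    f->iret.user_esp = user_stack_top;
    f->iret.user_ss  = user_ss;
    return f;
}
```

The kernel would then record the returned pointer as the saved cpl0 ESP of the task; the selector values passed in depend entirely on how the GDT is laid out.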
Pype.Clicker
Re:How exactly does the stack work?
Candy wrote: New idea about protected mode:
I'm thinking about using a single TSS for scratch space for the cpu. CR3 loading and stuff all software-methodics, but I don't know if I got this part correctly:
Using one TSS for scratch space (OS-wide...)
When a task is running, and an interrupt occurs sending it to the scheduler in cpl0, the processor stores all registers (eax - edi) on the stack for cpl3, switches to the cpl0 stack (from the TSS) and stores the return info (cs, eip, ss, esp) for cpl3 there.
Hmm. I'm unsure how you plan to save eax-edi on the user stack; this is in no way something an interrupt handler can do easily, unless you check in software the SS:ESP value that the CPU saved on your DPL0 stack and manually store the register values there ... and even then only if you also divert the instruction stream to a user-level "popa" instruction ...
Candy wrote: Then it enters the scheduler, which swaps the stack for the kernel stack.
Uh? Well, you were already on a CPL0 stack, so what's the plan for an additional "kernel stack" here? Can't the CPL0 stack of each process be the "kernel stack"?
Candy wrote: The kernel goes to the task choose module, switches the CR3 to that of the new tasks (thus letting the current task info change to the new tasks' info, but not changing the OS code (globally mapped)). The kernel then switches to the stack of a different process.
Or maybe you expect the kernel stack to be available in every process while the process-specific CPL0 stacks are not available globally ... hmm ... but why not just assume that you will not access the stack between the stack-switch code (from P1.dpl0_stack to P2.dpl0_stack) and the CR3 reprogramming?
Candy wrote: Upon doing an iret with cpl change, the cpl0 stack is stored in the TSS, the cpl3 info is loaded from the cpl0 stack, and the registers are loaded from the cpl3 stack, after which execution continues.
Watch out: there is no automatic restore of TSS.ESP0 or TSS.SS0 by the CPU. These fields are *static*, which means the CPU reads them when handling an interrupt, but it never writes to them automatically.
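Since the CPU only ever reads SS0:ESP0, keeping them current is the kernel's job. A minimal sketch in C, assuming one OS-wide TSS and a per-task record that remembers the top of that task's cpl0 stack; `tss_head`, `task`, `system_tss` and `switch_kernel_stack` are hypothetical names, and only the first fields of the 104-byte 32-bit TSS are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading fields of the 32-bit TSS; the rest of the structure is
   left out of this sketch. */
typedef struct {
    uint16_t prev_task, reserved0;
    uint32_t esp0;                 /* read by the CPU on a CPL3 -> CPL0 entry */
    uint16_t ss0, reserved1;
} tss_head;

_Static_assert(offsetof(tss_head, esp0) == 4, "ESP0 lives at offset 4");
_Static_assert(offsetof(tss_head, ss0) == 8, "SS0 lives at offset 8");

/* Hypothetical per-task record. */
typedef struct {
    uint32_t kstack_top;           /* top of this task's CPL0 stack */
} task;

/* Hypothetical single OS-wide TSS. */
static tss_head system_tss;

/* The CPU never writes these fields back, so the scheduler has to
   store the next task's CPL0 stack top on every switch. */
static void switch_kernel_stack(tss_head *tss, const task *next,
                                uint16_t kernel_ss)
{
    assert(tss && next);
    tss->ss0  = kernel_ss;
    tss->esp0 = next->kstack_top;
}
```

If every task used the very same cpl0 stack top, the fields could be set once and left alone; with per-process cpl0 stacks they have to follow the running task.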
Re:How exactly does the stack work?
Do you mean software task switching by means of switching stacks?
Oh sweet jesus, this stack-switching dark magic is something you have to do in the ISR prologue/epilogue. There are some tutorials at www.osdever.net which should explain this with code examples.
And you *have* to update esp0/ss0 in the global TSS for each task, so that the processor knows which esp0 to take in case of an interrupt.
Also, this *switching-to-kernel-stack* has to happen in the ISR prologue, right after saving all registers to the stack0 of the interrupted process and saving the esp0 value in some field for this process (a process structure).
Do you have some good books about OS dev at hand?
stay safe...
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image
Pype.Clicker
Re:How exactly does the stack work?
Candy wrote: Q: Wouldn't you get a stack overflow (kinda forgot how that worked...) by setting SP to 0?
Stack overflowing in PM works by checking for a page fault in the stack region right? any specific info I missed?
Setting ESP to 0 is usually bad practice in protected mode. It is very unlikely that your stack segment is 4GB wide *and* that address 0xFFFFFFFE is the valid top of your stack.
Stack overflow can be detected by a "guard page", but you can also use segmentation protection: an expand-down stack segment whose limit is computed so that the "valid stack addresses" start just above the limit (and go up to 0xFFFFFFFF) ...
The advantage of this technique over pure paging protection is that you can still handle a stack overflow of the kernel stack if you dedicate a TSS to handling the Stack Fault exception.
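The limit computation for such an expand-down segment can be sketched in C as follows. This assumes a 32-bit (B=1), page-granular (G=1) data segment, where the valid offsets run from the effective limit + 1 up to 0xFFFFFFFF; the function names are invented for the illustration and nothing here touches a real descriptor table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Raw 20-bit limit field for an expand-down segment whose valid offsets
   are exactly the top `pages` 4 KiB pages of the 4 GiB offset space. */
static uint32_t expand_down_limit_field(uint32_t pages)
{
    assert(pages >= 1 && pages <= 0xFFFFFu);
    return 0xFFFFFu - pages;
}

/* With G=1 the CPU scales the field by 4 KiB and fills the low 12 bits
   with ones. */
static uint32_t effective_limit(uint32_t field)
{
    return (field << 12) | 0xFFFu;
}

/* An access of `size` bytes at `offset` passes the limit check if it lies
   strictly above the effective limit and does not run past 0xFFFFFFFF. */
static bool access_ok(uint32_t field, uint32_t offset, uint32_t size)
{
    uint32_t lim = effective_limit(field);
    return size >= 1 && offset > lim && (0xFFFFFFFFu - offset) >= size - 1;
}
```

For a 4-page (16 KiB) stack the field comes out as 0xFFFFB, so offsets 0xFFFFC000..0xFFFFFFFF are usable and a push that drops below 0xFFFFC000 raises a stack fault instead of silently overwriting whatever lies underneath.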
Re:How exactly does the stack work?
Candy wrote: If I pushed a 32-bit value, the stack pointer would go to 0x7FFC. Would the byte at 0x8000 exact be overwritten or not?
This thread has already ventured beyond the initial question, but I usually try to explain the "why" along with the "what" because it makes understanding so much easier:
If your stack starts at 0x8000, and you push a 32-bit value, you'd expect that value to be longword-aligned, wouldn't you?
That's "why" 0x8000 isn't used...
Every good solution is obvious once you've found it.
Re:How exactly does the stack work?
Pype.Clicker wrote: hmm. i'm unsure on how you plan to save eax-edi on the user stack, but this is in no way something you can do easily with an interrupt handler -- except if you check by software the SS:ESP value that has been saved by the CPU on your DPL0 stack and manually store the register values there ... however, unless you also deroute the instruction stream to a user-level "popa" instruction ...
I was planning on using the cpl0 stack of the process for the eax-edi (or rax - r15) values of the process, by starting with a pushad-like function (something different for amd64, since they ripped out pushad...). Either the system pushes them automatically on interrupt (as in real mode, which is what I expected) or it does not, in which case the ones from the current stack are used.
Returning to the function would be like popa; iret.
Pype.Clicker wrote: uuh ? well, you were already on a CPL0 stack, so what's the plan for an additionnal "kernel stack" here ? can't the CPL0 stack of each process be the "kernel stack" ?
The idea behind having a different cpl0 stack for each process is that each process functions completely independently of the other processes and, most importantly, that the kernel does not suffer from whatever the user process might or might not do.
Pype.Clicker wrote: or maybe you expect the kernel stack to be available in every process while CPL0 process-specific stacks are not available globally ... hmm ... but why not just assuming that you will not access the stack between the stack-switch code (between P1.dpl0_stack and P2.dpl0_stack) and CR3 reprogramming ?
Because that would go straight against my design philosophy of keeping mechanism separate from policy. Switching the task itself is policy, and therefore a separate function should be called from that space (thereby necessitating a stack) for the CR3 switch. Also, for code structure, I would like to use the kernel stack for all kernel interrupts, this one being one of the primary ones. This guarantees that the stack is safe from anything related to the user process. Might just be superstition, but I think it's necessary.
Pype.Clicker wrote: watch out: there's nothing like an automatic restore of TSS.ESP0 or TSS.SS0 by the computer: these fields are *static*, which means the CPU reads them at interrupt handling, but it never writes to them automatically.
Okay... didn't notice that... still, I don't think they (need to) change, actually... if all's well, the stack will only be used for the kernel functions called by the current execution context, and every time it returns the stack will be identical...
part 2 follows
Re:How exactly does the stack work?
Quote: oh sweet jesus, this stack switching dark magic is something you have to do in the ISR prologue/epilogue. There are some tutorials at www.osdever.net which should explain this with code examples.
Code: Select all
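void sched__switch_wrap() {
    // switch to scheduler stack: save the current task's ESP, load the kernel's
    asm volatile ("movl %%esp, %0; movl %1, %%esp" : "=c"(ct->esp) : "a"(kernel_t.esp));
    ct->status = SCHED_SUSP;
    ...
    // switch stack back: save the kernel's ESP, load the ESP of whatever ct now points to
    asm volatile ("movl %%esp, %0; movl %1, %%esp" : "=c"(kernel_t.esp) : "a"(ct->esp));
    sti();
    if (ct->status == SCHED_SUSP) {
        ct->status = SCHED_RUNNING;
        // unwind this function's frame by hand, then return from the interrupt
        asm ("addl $8, %esp; popl %ebp; iretl");
    }
}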
Quote: And you *have* to update esp0/ss0 in the global tss for each task for that the processor knows which esp0 to take in case of an interrupt.
Okay, didn't know that; I'll remember it when I start programming the code that starts cpl3 tasks...
Quote: also this *switching-to-kernel-stack* has to happen in the isr-prologue, right after saving all registers to the stack0 of the interrupted process/saving the esp0 value in some field for this process (a process structure).
k... didn't save the registers yet... *slaps head* thx
Quote: Do you have some good books about os dev at hands?
The Indispensable PC Hardware Book (3rd + 4th edition), Modern Operating Systems (Andy), Operating systems backgrounds mechanism & design (stallings) and the series of processor books from AMD (paperback!)
Pype.Clicker wrote: Setting ESP to 0 is usually a bad manner in protected mode. It is very unlikely that your stack segment is 4GB wide *and* that address 0xFFFF.FFFE is the valid top of your stack.
Pype.Clicker wrote: Stack overflow can be detected by a "guard page", but you can also use the segmentation protection by using an expand-down stack segment which limit will be computed so that the "valid stack addresses" start at the limit (and go up to 0xFFFFFFFF) ...
I was planning on using paging and no segmentation, because of the second platform I want to support (AMD64), which does not support segmentation (and possibly the third, IA64, which has no support either). Stack underflow will be detected by an unmapped page, stack overflow by dynamic calculation of the stack size and by not mapping in another page (paging code).
Pype.Clicker wrote: The advantage of this technique over pure paging protection is that you still can handle a stack overflow for the kernel stack if you dedicate a TSS to the handling of Stack Fault exception.
Would be nice, but not using them allows easy porting to the other platforms. Again, I don't want to use a TSS for anything (anymore...), but using a separate stack for the stack fault exception would be an option. Since it has the same effect and size as the TSS option, it just doesn't matter.
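One way to read the paging-only plan is as a decision made inside the page-fault handler. The following C sketch is one interpretation under stated assumptions: the stack grows downward on demand up to a policy limit, and the page just below that limit is deliberately left unmapped as a guard. `stack_region`, `classify_fault` and the enum values are hypothetical names, and a real handler would also look at the fault's error code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

typedef enum {
    FAULT_NOT_STACK,       /* not a stack-growth fault */
    FAULT_GROW_STACK,      /* map another page and retry */
    FAULT_STACK_OVERFLOW   /* hit the guard page below the limit */
} stack_fault_kind;

typedef struct {
    uint32_t top;          /* one past the highest stack byte */
    uint32_t mapped_low;   /* lowest stack address currently mapped */
    uint32_t max_bytes;    /* policy limit on the stack size */
} stack_region;

/* Decide what a page fault at `addr` means for this stack. */
static stack_fault_kind classify_fault(const stack_region *s, uint32_t addr)
{
    assert(s->mapped_low <= s->top);
    assert(s->max_bytes <= s->top - PAGE_SIZE);
    uint32_t lowest_allowed = s->top - s->max_bytes;

    if (addr >= s->mapped_low)
        return FAULT_NOT_STACK;        /* mapped stack page or above it */
    if (addr >= lowest_allowed)
        return FAULT_GROW_STACK;       /* still within the size budget */
    if (addr >= lowest_allowed - PAGE_SIZE)
        return FAULT_STACK_OVERFLOW;   /* landed in the guard page */
    return FAULT_NOT_STACK;
}
```

A fault that skips over the guard page (a large local array, say) is not caught as an overflow by this scheme, which is the usual weakness of guard pages compared with the segment-limit check.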