OSDev.org

Posted: **Fri Aug 03, 2007 4:25 am**

Many OS's load some sort of object file and then allocate the stack per each thread.
I am considering making the threads make their own stacks, or use no stack.
This would allow the programmer to do what ever they want and make the stack as big or small as they want.

I am going to use software timer interupt driven task switching- each process has a page directory and each process makes threads that can have their own stack. All processes virtual address spaces start at 0 and extend to 4gb.

Each process has a structure (127 max processes) containing a id byte, status byte, page directory, bunch of blocks detailing the page location in ram or hardrive, and 127 thread structres
each thread structure has
id byte
status byte
sleep timer byte
storage for registers

the task switcher does

Code: Select all

loops through process structures
    loops through thread structures
        thread is ready to execute (see below)
        or sleep timer is not 0 yet
           subtract 1 from sleep timer
    continue looping though threadsuntil you reach 127
continue looping until you reach 127
hlt (idle process, the timer irq handler does task switching)

thread is ready to execute
    set page directory cr3 to the process's pd and load threads registers
    return (iret)

When the task switcher is ran if first switches to kernel page directory, saves the registers, sets up its own stack (1 task switcher/interrupt handler stack per cpu to avoid conflicts)
and enters the loop above.

Notice how this software task switcher avoids the assumption that task will have stacks, but it does save the registers so it still saves the stack if it exists.

The task switcher is a routine called by the system timer handler and can be called by a software interupt handler (who can be called by the task to implement sleep() ) as well as other software interupt handlers such as io (the thread wants to send io, it first ask the os for a handle to a device, then it can read input at anytime, and if it request write access it can call the interupt who switches the tasks so eventually the device driver task (think microkernel) can process it and return a value into the read stream by the time the requesting task resumes- heh thats long).

Anyways this will be my first os so don't assume I know what I am talking about.

Posted: **Fri Aug 03, 2007 5:10 am**

Hi,

A few comments:

Gizmo wrote: I am considering making the threads make their own stacks, or use no stack.

If each task has a space of (say) 3GB and you place the user stack at the top of task space, you potentially have a 3GB stack. As long as you keep track of where the stack boundaries are, you can dynamically assign stack pages as necessary. I would suggest this is a better method because you are then able to pass initial values to your task. For example, if you want all your processes to start with:

Code: Select all

int main(int argc, char **argv)
{
}

you can just place argc and argv on the stack when you create the new process. You can also provide a return value for tasks which do not exit() nicely. You may want to use this system even if you have a C runtime set up, because you can just pass argc and argv on to the 'real' main function.

Gizmo wrote: I am going to use software timer interupt driven task switching- each process has a page directory and each process makes threads that can have their own stack. All processes virtual address spaces start at 0 and extend to 4gb.

Are you mapping the kernel space in to each task space? Lots of kernels seem to do this, because it means you do not have to manually switch to kernel space each time a privileged interrupt happens.

Gizmo wrote: Each process has a structure (127 max processes) containing a id byte, status byte, page directory, bunch of blocks detailing the page location in ram or hardrive, and 127 thread structres

I would suggest having space for more than 127 processes. Hardware task switching limits you to just under 8192 processes and some people consider this an unnecessary limit.

Gizmo wrote: each thread structure has
id byte
status byte
sleep timer byte
storage for registers

Sounds ok - I would just extend the sleep byte to a longer int. Supposing you are sleeping in mSec intervals, one byte only lets you sleep for half a second. Inevitably when you code it, you will find that the structure needs extending (there's always something you didn't think of

). Storage for registers is often done on the task's ring 0 stack.

Gizmo wrote: the task switcher does

Code: Select all

loops through process structures
    loops through thread structures
        thread is ready to execute (see below)
        or sleep timer is not 0 yet
           subtract 1 from sleep timer
    continue looping though threadsuntil you reach 127
continue looping until you reach 127
hlt (idle process, the timer irq handler does task switching)

Thread is ready to execute
    set page directory cr3 to the process's pd and load threads registers
    return (iret)

Not every thread 'slot' and every task 'slot' will always be filled. You will probably want something like a linked-list which can have a circular reference, so you do not do more checking than you need to. Remember that while you are scheduling the next task, nothing else is happening, so you need to get that next task running as soon as possible. Some people have the scheduler itself running as a separate task.

Gizmo wrote: When the task switcher is ran if first switches to kernel page directory, saves the registers, sets up its own stack (1 task switcher/interrupt handler stack per cpu to avoid conflicts)
and enters the loop above.

Sounds good, I think.

Gizmo wrote: Notice how this software task switcher avoids the assumption that task will have stacks, but it does save the registers so it still saves the stack if it exists.

Ah - I think I see what you are getting at now by "No Stack" earlier in your post.

When you get to having user tasks running in ring3 and the kernel in ring0, you need a ring 0 stack per task. This is because, when you are running in ring 3 and your scheduler interrupt occurs, the CPU will automatically (no intervention) do the following:

* Load ESP0 from the Task State Segment (TSS).
* Simultaneously load CS from the Interrupt Descriptor.
* Push the user ESP3 and SS (I forget which order).
* Push EFLAGS, CS and EIP
* Jump to your ISR.

This means that you need some kind of ring 0 stack. Moving the pushed values to another location simply wastes CPU cycles in a part of your OS which is going to be used a lot.

Gizmo wrote: The task switcher is a routine called by the system timer handler and can be called by a software interupt handler (who can be called by the task to implement sleep() ) as well as other software interupt handlers such as io (the thread wants to send io, it first ask the os for a handle to a device, then it can read input at anytime, and if it request write access it can call the interupt who switches the tasks so eventually the device driver task (think microkernel) can process it and return a value into the read stream by the time the requesting task resumes- heh thats long).

Good stuff.

Gizmo wrote: Anyways this will be my first os so don't assume I know what I am talking about.

Please don't assume I'm trying to pick holes in what you say for the sake of it. Hopefully it will just give you some food for thought.

Cheers,
Adam

Posted: **Fri Aug 03, 2007 5:57 am**

I would suggest having space for more than 127 processes. Hardware task switching limits you to just under 8192 processes and some people consider this an unnecessary limit.

Actually it would be 8192 tasks, which are really threads not processes- a process is an application which has 1 main thread and makes more threads.
I'm trying to keep the process structures small so my kernel has a small footprint.

Are you mapping the kernel space in to each task space? Lots of kernels seem to do this, because it means you do not have to manually switch to kernel space each time a privileged interrupt happens.

I'm not sure I want userland being able to read pages mapped to kernel. I guess I am paranoid

I would suggest this is a better method because you are then able to pass initial values to your task. For example, if you want all your processes to start with:

Code:

int main(int argc, char **argv)
{
}

you can just place argc and argv on the stack when you create the new process. You can also provide a return value for tasks which do not exit() nicely. You may want to use this system even if you have a C runtime set up, because you can just pass argc and argv on to the 'real' main function.

C programs don't start with main(), they start with ctr0() that calls main().
ctr0 can read from parameters perhaps if I map them to the end of the process's address space.

When you get to having user tasks running in ring3 and the kernel in ring0, you need a ring 0 stack per task. This is because, when you are running in ring 3 and your scheduler interrupt occurs, the CPU will automatically (no intervention) do the following:

* Load ESP0 from the Task State Segment (TSS).
* Simultaneously load CS from the Interrupt Descriptor.
* Push the user ESP3 and SS (I forget which order).
* Push EFLAGS, CS and EIP
* Jump to your ISR.

This means that you need some kind of ring 0 stack. Moving the pushed values to another location simply wastes CPU cycles in a part of your OS which is going to be used a lot.

How would you do task switching software based?
Have 1 tss and one kernel stack?
(I dont think I completely get the moving from ring 3 to ring 0 part yet)

[/quote]

Posted: **Fri Aug 03, 2007 7:17 am**

Gizmo wrote: Actually it would be 8192 tasks, which are really threads not processes- a process is an application which has 1 main thread and makes more threads.

Not really

. I think you will find there are different definitions about, but a task does not have to have more than one thread. In fact, you do not have to have any provision in your OS for threads at all - this is becoming rarer though. Generally a task (process) is defined by the fact it runs in its own memory space, whereas several threads of the same task all use the same memory space (with different stacks). A TSS allows you to provide separate memory spaces via CR3 value (paging) or LDT's (segmentation).

Gizmo wrote: I'm not sure I want userland being able to read pages mapped to kernel. I guess I am paranoid

That's ok - you can use the x86 built-in protection to prevent user-mode code addressing kernel pages in the same memory space (see the u/s bit of the page table entry / page directory entry).

Gizmo wrote: C programs don't start with main(), they start with ctr0() that calls main().
ctr0 can read from parameters perhaps if I map them to the end of the process's address space.

Absolutely, but I would suggest that one way of passing those parameters would be to do so via the stack.

Gizmo wrote: How would you do task switching software based?
Have 1 tss and one kernel stack?
(I dont think I completely get the moving from ring 3 to ring 0 part yet)

When any ring 0 interrupt happens during ring3 execution, the sequence above occurs (switch to ring 0 stack, push ESP, SS, EFLAGS, CS, EIP). All you have to do is push the segment registers and general purpose registers, switch to the stack of the incoming task, then pop all the values for that incoming task. An iret restores EIP, CS, EFLAGS, SS and ESP. As you suggest, you have one TSS, and change ESP0 in that TSS for each task switch (there are ways around even having to do this).

When you set up a new task, you just prepare a new stack in the format your task switch ISR expects to see.

Adam

Posted: **Fri Aug 03, 2007 10:16 am**

So basically during the execution of tasks, my tss contains the ring 0 kernel/interrupt handler stack in esp, and when an interrupt happens in a ring 3 task that task's stack and other regs gets swapped into the tss and whatever was stored in tss is now the stack for my handler?

How do you know if the task that was interrupted was ring 0 or ring 3 when you code the handler?- just use ring 3 for all tasks and do all ring 0 stuff inside of my isr's?

Posted: **Fri Aug 03, 2007 7:41 pm**

I'm not sure I want userland being able to read pages mapped to kernel. I guess I am paranoid

It's only just the way most other operating systems do it

.

You just make sure that the userland process's pages are set with their pages mapped as 'user' pages, and the kernel as 'supervisor' pages. Only ring0 can read/write supervisor pages.

Posted: **Sat Aug 04, 2007 2:24 am**

pcmattman wrote:Only ring0 can read/write supervisor pages.

I'm pretty sure rings 1&2 can also access them. Not that those are used a lot, but still...

Posted: **Sat Aug 04, 2007 3:49 am**

Gizmo wrote:So basically during the execution of tasks, my tss contains the ring 0 kernel/interrupt handler stack in esp, and when an interrupt happens in a ring 3 task that task's stack and other regs gets swapped into the tss and whatever was stored in tss is now the stack for my handler?

No. Using software-based switching, *you* push the gen. purpose and segment regs on to the ring 0 stack, swap stacks and then pop the regs of the new task. The only way the TSS is involved is that it contains the ESP0 that will get used when the interrupt occurs.

Gizmo wrote: How do you know if the task that was interrupted was ring 0 or ring 3 when you code the handler?- just use ring 3 for all tasks and do all ring 0 stuff inside of my isr's?

Often, it doesn't matter, but if you need to know you can check the pushed CS, or have a user/kernel flag in your task management structure. You will find the same interrupt handler works for an interrupted kernel or user task.

Adam

Posted: **Wed Aug 22, 2007 10:44 pm**

I guess I'm even more confused than Gizmo ...
I'm trying to write my taskswitcher right now, and this is exactly the issue I'm dealing with.

A timer IRQ0 fires. I was running in either ring 3 or ring 0, say. I collect the values of the GP registers of the interrupted thread, and store them away. I store DS, ES, FS, GS.

Now, I also need to store away the former EIP, Eflags, CS, SS, and ESP.
So, how many dwords do I pop off the stack? 3? 5? I can't leave them ON there! The values (and # of dwords!) are certainly all wrong for my next IRET. And I certainly can't have old, used-up values that were pushed by INTs growing forever on my stack.

Do I have to pop 2, mask the former CS to find out its ring #, and then selectively either pop 1 or 3 more? And, if 1, grab SS and ESP directly?
And equivalently for the situation of the new task I'm swapping TO? I need to check the CS value that I'm trying to apply, and decdie whether to push 3 or 5 dwords onto the stack, for the IRET to utilize?

And doesn't this all make it extremely difficult to reset the value of ESP when swapping to a ring0 task from a ring0 taskswitcher? POPA won't do it! So I have to "preswitch" to the new task's stack, BEFORE I load it up with the register values to POPA and then IRET? And pray like hell that nobody ever tries to reset the SS register in a ring0 task?

I'm very surprised that it seems like nobody else seems to have to deal with this almost-logically-inconsistent situation.

I've also been looking at trying to do this without using a POPA -- but I seem to be running into a true logical inconsistency in trying to get the DS register reset.

Posted: **Thu Aug 23, 2007 1:21 am**

bewing wrote:Now, I also need to store away the former EIP, Eflags, CS, SS, and ESP.
So, how many dwords do I pop off the stack? 3? 5? I can't leave them ON there!

Could you tell my why exactly you can't leave them on the stack? Maybe I (we) don't know something about your design. If there's only one stack in kernel space, you'll have to save/restore these values to/from somewhere else, of course. Most multitasking systems I know use a seperate stack in kernel space for each userspace thread, so in this case you could just leave these values on the kernel stack and switching the stack automatically switches the thread.

bewing wrote:And doesn't this all make it extremely difficult to reset the value of ESP when swapping to a ring0 task from a ring0 taskswitcher? POPA won't do it! So I have to "preswitch" to the new task's stack, BEFORE I load it up with the register values to POPA and then IRET?

A ring0 task would probably have its own stack in kernel space, right? So, same as above: store the values on the task's kernel stack, then switch to the next task and restore them from its own stack.

cheers
Joe

Posted: **Thu Aug 23, 2007 2:37 am**

I wonder if the misconception is that the stack is somehow a special region of memory? It isn't and you certainly don't have to switch to the stack in order to put the initial values on. Just do something like:

Code: Select all

unsigned long *s = (unsigned long*)(malloc(sizeofnewstack) + sizeofnewstack);
*s-- = SS;
*s-- = ESP;
*s-- = ELFAGS;
*s-- = CS; //CS PL must = SS PL
*s-- = EIP; //the new task's entry point
*s-- = DS; //appropriate PL data segment
*s-- = ES;
*s-- = FS;
*s-- = GS;
*s-- = EAX;
*s-- = ECX;
*s-- = EBX;
*s-- = EDX;
*s-- = ESP;
*s-- = EBP;
*s-- = ESI;
*s = EDI;

Of course, the general purpose registers may all just be NULL. Once this new task's stack is set up, just pass 's' to your task switcher, which can do the POPA, POP DS... and IRET as required.

Cheers,
Adam

Posted: **Thu Aug 23, 2007 7:12 am**

JoeKayzA wrote:Could you tell my why exactly you can't leave them on the stack? Maybe I (we) don't know something about your design. If there's only one stack in kernel space, you'll have to save/restore these values to/from somewhere else, of course. Most multitasking systems I know use a seperate stack in kernel space for each userspace thread, so in this case you could just leave these values on the kernel stack and switching the stack automatically switches the thread.

*blink blink* Duh. Woops.

Thanks Joe & Adam. I was simply under the impression that while a task was swapped out that I needed to save a copy of its "state" somewhere. But, of course, as you say -- for any task with its own stack, I can just leave the state dangling off the bottom of its stack while the task is swapped out. When I want to swap back, I just have to change ss/esp (with an LSS instruction, I suppose) -- then pop registers and iret, and I'm back in business. Each task will have the appropriate iret for its own ring, because that's how it got swapped out in the first place.

Posted: **Thu Aug 23, 2007 7:41 am**

Correct except on one point - you don't need to use the LSS instruction. Unfortunately I don't have my handbook handy to see what exactly that instruction is, but...

Code: Select all

pop eax     ; get the SS off the stack.
mov ss, ax
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax

That should switch it for you...

Posted: **Thu Aug 23, 2007 8:40 am**

That version of a switch would assume that no app was playing any games with its segment registers. I'm not sure I should assume that. (And I think I want to use separate segments for data and stack, anyway.) Forcing valid numbers into the segment registers would prevent GPFs from invalid segment register values, yes, but theoretically there is nothing to specifically prohibit an app from carefully modifying some of the segment registers and using them successfully.

(And LSS is the supposed "preferred" way to set both ESP and SS at the same time, from a memory location.)

The thing that confuses me now is that ring0 and ring3 apps store their INT/IRET info in different places.

A ring0 app will store EFLAGS/CS/EIP on its own stack. This is very nice.

If I only have one ring0 stack, then an interrupted ring3 app will store SS/ESP/EFLAGS/CS/EIP on the ring0 system stack! Isn't that correct? This is totally unacceptable -- I may not be returning to ring 3. I need that "state" info stored somewhere with the task -- preferably in the task's own memory space. So I have to remove it from the ring0 system stack, right? So I'm back with trying to determine whether I interrupted from CPL0 or 3.

-- Is there a convenient way to convert a paged userspace ESP stack pointer into a physical memory pointer??

use a seperate stack in kernel space for each userspace thread

OK, now I'm seeing what you meant by this -- so every single userspace thread that gets spawned allocates both a new stack in userspace AND a new stack in physical kernelspace? Gawd, that's horrifying. I don't want to waste that quantity of memory in kernelspace just to handle interrupts from every possible running thread in userspace.

Posted: **Thu Aug 23, 2007 10:28 am**

The user apps ESP, SS, CS, EIP all need to be stored on the kernel stack. Think about what could happen if these values were stored on the user stack. It would be possible for an application to change its permission level just by modifying cs and ss.

OSDev.org

allow task to make their own stacks

allow task to make their own stacks

Re: allow task to make their own stacks