Switching to Ring 0 task with iret
Hi all, after a bit of a hiatus I went back to the development of my OS.
I wrote a memory manager that seems to work well, and the next logical step for me is to implement processes.
Since I have no filesystem from which to load executables, I figured I'd start by implementing kernel (ring 0) processes. The thinking behind this is that the code to be executed is just one of my functions that already gets compiled into my kernel and, since the only privilege level I'll be dealing with is 0, I don't need to bother with per-process page directories and such. I'll just use the one set up by the kernel for its own use.
My process struct is basically a string for the name, an int for the pid, a pointer to a page directory (the same one used by the kernel), some virtual memory regions allocated to the process (again, the same ones I use for the kernel), a struct containing all the register values (notably eip for context switching) and a pointer to be used for the 'per process' stack.
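A minimal sketch of what such a process struct could look like in C. Every name here (`struct process`, `struct cpu_regs`, `struct vm_region`, `process_init`, `PROC_NAME_MAX`) is hypothetical; the post only lists the fields, not their types or names.

```c
#include <stdint.h>
#include <string.h>

#define PROC_NAME_MAX 32            /* assumed limit, not from the post */

/* Saved register values; eip is the one that matters for resuming. */
struct cpu_regs {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp;
    uint32_t eip, eflags;
};

/* One allocated virtual memory region (hypothetical layout). */
struct vm_region {
    uint32_t start, length;
    struct vm_region *next;
};

struct process {
    char name[PROC_NAME_MAX];
    int pid;
    uint32_t *page_dir;             /* shared kernel page directory */
    struct vm_region *regions;      /* shared with the kernel for now */
    struct cpu_regs regs;
    void *stack_top;                /* top of the 'per process' stack */
};

/* Fill in a freshly created ring 0 process. */
static void process_init(struct process *p, const char *name, int pid,
                         uint32_t *kernel_page_dir, uint32_t entry_eip,
                         void *stack_top)
{
    memset(p, 0, sizeof *p);
    strncpy(p->name, name, PROC_NAME_MAX - 1);  /* memset keeps it NUL-terminated */
    p->pid = pid;
    p->page_dir = kernel_page_dir;
    p->regs.eip = entry_eip;
    p->regs.eflags = 0x202;         /* IF set, plus the always-one bit 1 */
    p->stack_top = stack_top;
}
```

The `stack_top` field is the important one for the rest of this thread: each process needs a stack of its own, even when everything runs in ring 0.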
I have set up and tested everything that surrounds the actual running of processes: a scheduler, an interrupt that fires every n milliseconds and calls the scheduler, and all the logic for creating, killing and putting processes to sleep.
The problem I have is that, at the moment, my timer/scheduler interrupt handler only goes from kernel code (outside any process context) to some other kernel code (also outside process context), and it's implemented with iret.
This is what I understand is happening:
The clock interrupt gets triggered and, no matter what was happening at the time, the CPU pushes a bunch of state belonging to the 'interrupted' context so that I can restore it later.
On top of what the CPU already pushed, I push some more state and call my handler routine.
My handler saves the info pushed on the stack earlier (as it belongs to something that was running and, possibly, needs to be scheduled away), chooses a process to run and 'substitutes' its values on the stack so that iret switches to this process' context instead of the one previously interrupted.
If I didn't talk BS up until this point, here goes my problem.
Afaik, iret will NOT pop ss:esp from the stack if the privilege level it returns to (the RPL of the saved cs) is the same as my current one.
Since I'm already in ring 0 and I want to switch to some other code running in ring 0, no stack will be restored and, therefore, my new process will be executing using the stack that was already used by my kernel.
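To make the two cases concrete, here is a sketch of the frames involved in 32-bit protected mode, for an interrupt without an error code (such as the timer IRQ). The struct and function names are made up for illustration, and the selector values in the usage note assume a common GDT layout that the post does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Pushed by the CPU when the interrupt does not change privilege level
 * (ring 0 -> ring 0). Lowest address first, i.e. eip is at the top of stack. */
struct iret_frame_same_ring {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
};

/* Pushed when the interrupt goes from ring 3 to ring 0: the interrupted
 * ss:esp is saved as well, and iret pops it again on the way back. */
struct iret_frame_ring_change {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

static_assert(sizeof(struct iret_frame_same_ring) == 12, "3 dwords");
static_assert(sizeof(struct iret_frame_ring_change) == 20, "5 dwords");

/* iret reloads ss:esp only when it returns to a less privileged ring,
 * which it detects by comparing the RPL of the saved cs with the CPL. */
static int iret_pops_stack(uint16_t current_cs, uint16_t return_cs)
{
    return (return_cs & 3u) > (current_cs & 3u);
}
```

With an assumed kernel code selector of 0x08 and user code selector of 0x1B, `iret_pops_stack(0x08, 0x08)` is 0 and `iret_pops_stack(0x08, 0x1B)` is 1. In the ring 0 to ring 0 case the stack pointer after iret is simply wherever the frame was, so rewriting the frame in place leaves the new code on the old stack.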
I managed to successfully execute my plan of running/stopping/sleeping/restoring some kernel code contained in my kernel (ELF file) with the stack that I generally use for my 'regular kernel'.
This stops working after I run a few processes because, I assume, they all share the same stack and DO NOT clean up after themselves (for example, if one gets interrupted before it returns from a call), resulting in corruption and wrong memory regions being accessed.
How am I supposed to do this? Do I need to move away from iret?
More importantly, am I missing some fundamental theory here?
Any help that would shed more light on this would be very much appreciated.
Cheers,
F
Re: Switching to Ring 0 task with iret
Most OSes multitask kernel threads quite differently. Your method is the same as my old method, and many problems resulted from it. I could explain it for you, but it is already covered in the wiki at https://wiki.osdev.org/Brendan%27s_Mult ... g_Tutorial
Re: Switching to Ring 0 task with iret
I think a more reasonable view of a process is that it is a new address space that happens to have a thread executing within it (and might create more). The initial thread of a new process is nothing special and should be viewed as similar to any other thread in the system.
I also think that before creating processes you should have userspace functionality. A process is basically a new userspace address space executing on top of your normal (shared) kernel space. Processes in kernel space don't make a lot of sense, since kernel space is usually mostly shared between processes.
Re: Switching to Ring 0 task with iret
nexos wrote:
> How most OSes multitask kernel threads is quite different. Your method is the same as my old method, and many problems resulted. I could explain for you, but it is already covered in the wiki at https://wiki.osdev.org/Brendan%27s_Mult ... g_Tutorial
Hi, thank you so much for your answer, I've not come across that tutorial yet.
I still have some doubts though, mainly about the context switch itself.
Basically, all I need to solve my problem is in the first part, the little piece of asm code that sets up the next task and rets to it.
It kinda makes sense to me if switch_to_task gets called by a running task itself. In his tutorial, he starts with cooperative multitasking, so that assumption would hold. The code pushes registers onto the task's own stack and then saves esp into its task struct so that it can be restored later.
I struggle to understand how this could work if executed by the PIT handler. He calls switch_to_task inside of his schedule() function.
Wouldn't this just save the out-of-context-kernel-timer-handler-registers into the current running process? If this is correct, why?
Cheers,
F
Re: Switching to Ring 0 task with iret
One thing to know is that an interrupt is different from a context switch. What happens with the PIT is that the interrupt handler gets called, and then the register state gets saved. It then calls a function that checks whether the running task should be preempted. If so, it calls switch_to_task. Then, when the new task runs, it restores its own interrupted context. This may seem slower than what you are doing right now, but most context switches end up occurring because a task pauses itself, and in that case this method is faster.
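A rough sketch of that "should the running task be preempted?" check, written as plain C so the logic can be followed on its own. The time-slice counter, the round-robin `next` pointer and all names except switch_to_task are assumptions made for illustration, and switch_to_task is replaced here by a stub that only records the switch instead of swapping stacks.

```c
#include <stddef.h>

#define TIME_SLICE_TICKS 5          /* assumed slice length in timer ticks */

struct task {
    int id;
    unsigned ticks_left;            /* ticks remaining in the current slice */
    struct task *next;              /* round-robin ring of runnable tasks */
};

static struct task *current_task;
static unsigned switch_count;

/* Stand-in for the real assembly routine: only records the switch. */
static void switch_to_task(struct task *next)
{
    current_task = next;
    switch_count++;
}

/* Called from the timer IRQ handler after the register state has been
 * saved, still running on the interrupted task's kernel stack. */
static void timer_tick(void)
{
    if (current_task == NULL)
        return;
    if (current_task->ticks_left > 1) {
        current_task->ticks_left--;         /* slice not used up yet */
        return;
    }
    struct task *next = current_task->next;
    current_task->ticks_left = TIME_SLICE_TICKS;  /* refill for next turn */
    if (next != current_task)
        switch_to_task(next);               /* preempt the running task */
}
```

In a real kernel the call to switch_to_task would not return until the preempted task is scheduled again, at which point timer_tick returns into the IRQ handler and the handler's iret resumes the interrupted code.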
Re: Switching to Ring 0 task with iret
nexos wrote:
> One thing to know is that an interrupt is different from a context switch. What happens with the PIT is that the interrupt handler gets called, and then the register state gets saved. It then calls a function that checks whether the running task should be preempted. If so, it calls switch_to_task. Then, when the new task runs, it restores its own interrupted context. This may seem slower than what you are doing right now, but most context switches end up occurring because a task pauses itself, and in that case this method is faster.
I have to say, I'm a little confused. I understood things just the way you explained them, but I don't see how the code in the tutorial implements these concepts.
When a PIT interrupt gets triggered, the CPU uses the IDT to locate the eip of my timer handler and then uses the value contained in TSS.esp0 to set up the stack for my kernel code (which should be my responsibility to keep updated every time I use the stack in an int handler, right?).
Now, correct me if I'm wrong, my timer handler will never return to the code immediately after it. This is because I set up a stack with the eip of the next task and then ret to it.
So in the example I have interrupt code -> schedule() -> switch_to_task() and, if I understood correctly, switch_to_task will not just return to schedule(), which would return to the interrupt code, but will instead jump into the new task's code.
Maybe I can't read assembly that well but in the prologue of the switch_to_task there's this code:
Code:
push ebx                      ;save the registers the caller expects to be preserved
push esi
push edi
push ebp
mov edi,[current_task_TCB]    ;edi = address of the previous task's "thread control block"
mov [edi+TCB.ESP],esp         ;save the previous task's stack pointer in its TCB
Besides, since switch_to_task gets called by schedule(), wouldn't ebx, esi and so forth contain the values as set by the caller (schedule())?
Shouldn't we save the values of the registers that we got as soon as the interrupt occurred (that is, the register values of the previous, and now preempted, task)?
Cheers,
F
Re: Switching to Ring 0 task with iret
It works like an ordinary function call. Eventually switch_to_task() will return to its caller. The caller expects those registers to be unchanged, so switch_to_task() has to save them.
Let's say thread A calls switch_to_task() to switch to thread B. Then thread B runs for a while and calls switch_to_task() to switch to thread A. From thread A's perspective, all that happened is that it called switch_to_task(), some time passed, and switch_to_task() returned.
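The "call, some time passes, the call returns" behaviour can be tried out in user space. The sketch below is only an analogy, not kernel code: it uses the POSIX ucontext functions (available in glibc on Linux, marked obsolescent in newer POSIX revisions) in place of switch_to_task, and the names `thread_b`, `run_demo` and `trace` are made up for the demonstration.

```c
#include <ucontext.h>

static ucontext_t ctx_a, ctx_b;
static char stack_b[64 * 1024];     /* thread B gets a stack of its own */
static int trace[8];
static int trace_len;

/* Thread B: runs for a while, then switches back to thread A. */
static void thread_b(void)
{
    trace[trace_len++] = 2;
    swapcontext(&ctx_b, &ctx_a);    /* B's "switch_to_task(A)" */
}

/* Thread A: switches to B and later simply sees the call return. */
static int run_demo(void)
{
    trace_len = 0;

    getcontext(&ctx_b);
    ctx_b.uc_stack.ss_sp = stack_b;
    ctx_b.uc_stack.ss_size = sizeof stack_b;
    ctx_b.uc_link = &ctx_a;         /* where to go if thread_b ever returns */
    makecontext(&ctx_b, thread_b, 0);

    trace[trace_len++] = 1;
    swapcontext(&ctx_a, &ctx_b);    /* A's "switch_to_task(B)" */
    trace[trace_len++] = 3;         /* reached only after B switches back */
    return trace_len;
}
```

After `run_demo()` the trace holds 1, 2, 3: A ran, B ran on its own stack, and A continued right after its switch call with its local state intact. swapcontext saves the calling context and restores another one, which is the same job the push/pop pairs and the esp swap do in the tutorial's switch_to_task.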
Re: Switching to Ring 0 task with iret
Octocontrabass wrote:
> It works like an ordinary function call. Eventually switch_to_task() will return to its caller. The caller expects those registers to be unchanged, so switch_to_task() has to save them.
> Let's say thread A calls switch_to_task() to switch to thread B. Then thread B runs for a while and calls switch_to_task() to switch to thread A. From thread A's perspective, all that happened is that it called switch_to_task(), some time passed, and switch_to_task() returned.
This is what I do not agree with.
switch_to_task does NOT get called by a thread. It gets called by my interrupt handler running in ring 0.
As I understand it, interrupt handlers are not processes, threads, tasks or whatever you want to call them. They are just kernel code that runs outside any context or, if you will, in interrupt context.
This is why I don't understand why we save some data that belongs to the kernel into some process' TCB.
I am not developing a microkernel nor a cooperative multitasking system.
I am trying to write a monolithic kernel with a preemptive system and, possibly, supporting kernel threads.
Tell me if what I just wrote is totally wrong because between not getting the technicalities right and having, perhaps, a theory gap, I'm going nuts
Cheers,
FG
Re: Switching to Ring 0 task with iret
Hi,
switch_to_task would get called by the current thread. Kernel code is shared between all threads that enter it. Just because a thread enters kernel code does not make it no longer a thread.
Re: Switching to Ring 0 task with iret
Hi, can you elaborate a little more on that? I'm very confused.
How can a thread call switch_to_task? Isn't that what happens in coop multitasking?
What if a thread never calls it? Does it run on the cpu forever?
When an int gets triggered, the CPU jumps to the entry point of my handler (found in the IDT) and loads the stack it finds in tss.esp0. To me, that is no longer the thread being executed, it is my handler.
So in any call to any function, the caller is my int handler.
The cpu doesn't know anything about processes, it's just an abstraction that I created in software, essentially by updating a structure that contains info about the current running thread.
Is this totally off?
Thank you (all) for your time.
Cheers,
F
Re: Switching to Ring 0 task with iret
Lagor wrote:
> Hi, can you elaborate a little more on that? I'm very confused.
> How can a thread call switch_to_task? Isn't that what happens in coop multitasking?
> What if a thread never calls it? Does it run on the cpu forever?
> When an int gets triggered, the CPU jumps to the entry point of my handler (found in the IDT) and loads the stack it finds in tss.esp0. To me, that is no longer the thread being executed, it is my handler.
> So in any call to any function, the caller is my int handler.
> The cpu doesn't know anything about processes, it's just an abstraction that I created in software, essentially by updating a structure that contains info about the current running thread.
> Is this totally off?
> Thank you (all) for your time.
> Cheers,
> F
I think what is confusing you is how tss.esp0 is used. %esp is set to the value of tss.esp0 only when a privilege transition occurs. For example, if you're in ring 3 and the PIT fires, the IRQ transitions the CPU to ring 0 to handle the interrupt, and that transition (from ring 3 to ring 0) loads the new %esp from tss.esp0.
If you're already in ring 0, because you have kernel thread code running, then no privilege transition occurs, and %esp is just used as is and the interrupt context is formed on the existing kernel stack.
Once tss.esp0 is set for a thread, it doesn't need to change until you switch to another thread, provided each thread has its own kernel stack. Part of the task switch on x86 is to update tss.esp0 to point to the top of the kernel stack of the thread being switched to.
Also, I'd get out of your head the idea that an interrupt handler is a separate thread. Sure, on x86 you can have task gates in the IDT, so that an interrupt does trigger a hardware task switch, but that's a non-portable construct that is too complex and too slow. It's best to think of an interrupt handler as having no context of its own: it operates within the context of whichever thread was interrupted. As such, it can rely on little other than static or global state.
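As a sketch of the tss.esp0 bookkeeping described above: the only step that touches the TSS is the task switch. The names below (`struct tss_head`, `struct thread`, `switch_kernel_stack`, `KERNEL_DS`) are hypothetical, the data segment selector value is an assumption about the GDT, and only the first three fields of the 104-byte 32-bit TSS are modelled.

```c
#include <stdint.h>

#define KERNEL_DS 0x10u             /* assumed kernel data selector */

/* Start of the 32-bit TSS: link at offset 0, esp0 at 4, ss0 at 8. */
struct tss_head {
    uint32_t prev_task_link;
    uint32_t esp0;                  /* stack loaded on a ring 3 -> ring 0 entry */
    uint32_t ss0;
};

struct thread {
    uint32_t kstack_top;            /* top of this thread's kernel stack */
};

static struct tss_head tss = { 0, 0, KERNEL_DS };   /* the one TSS of a single-CPU system */
static struct thread *current_thread;

/* Part of every task switch: from now on, an interrupt arriving while
 * `next` runs in ring 3 starts on next's kernel stack. Interrupts that
 * arrive in ring 0 ignore esp0 and push onto the stack already in use. */
static void switch_kernel_stack(struct thread *next)
{
    tss.esp0 = next->kstack_top;
    current_thread = next;
}
```

Between two task switches nothing has to maintain esp0: when a ring 3 thread returns to user mode its kernel stack is empty again, so the top-of-stack value stays valid.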
Re: Switching to Ring 0 task with iret
Hi,
> How can a thread call switch_to_task? Isn't that what happens in coop multitasking?
Yep. Except with preemptive multitasking, the thread is told when its time is up. The interrupt interrupts the current thread so it continues execution in the interrupt handler. The only time a new thread will execute is when a new thread context is pushed on the stack for IRET. Each thread has its own user space and kernel space stack.
Re: Switching to Ring 0 task with iret
It took me a few answers from you kind people but I feel I'm getting it now.
Two more questions: is it mandatory to set up a kernel stack for each user process? I think I've heard that some OSes don't do this, but I might be wrong.
If I choose (or have) to have a kernel stack for every user process, does this mean that I need to keep an ESP0 value for each process in its TCB and then update the value in the only TSS that my system has accordingly? If this is the case, I have a few issues picturing the situation in my mind, but I'll deal with that later...
Cheers,
F
Re: Switching to Ring 0 task with iret
Hi,
Stacks are not per process; they are per thread. Each thread must have its own kernel stack.
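A sketch of giving a new thread its own kernel stack, primed so that the first switch into it lands on the thread's entry point. It assumes a switch routine that pops ebp, edi, esi and ebx (the reverse of the pushes quoted earlier in the thread) and then executes ret. The names `struct kthread`, `kthread_create` and `KSTACK_WORDS` are hypothetical, and malloc stands in for whatever allocator the kernel provides.

```c
#include <stdint.h>
#include <stdlib.h>

#define KSTACK_WORDS 1024           /* assumed size: 4 KiB of 32-bit words */

struct kthread {
    uint32_t *kstack_base;          /* lowest address of the stack */
    uint32_t *saved_esp;            /* what the switch routine loads into esp */
};

/* Allocate a kernel stack and lay out the values the switch routine
 * will pop the first time this thread is switched to. */
static int kthread_create(struct kthread *t, uint32_t entry_eip)
{
    t->kstack_base = malloc(KSTACK_WORDS * sizeof(uint32_t));
    if (t->kstack_base == NULL)
        return -1;

    uint32_t *sp = t->kstack_base + KSTACK_WORDS;   /* stacks grow downwards */
    *--sp = entry_eip;              /* consumed by the final ret */
    *--sp = 0;                      /* initial ebx */
    *--sp = 0;                      /* initial esi */
    *--sp = 0;                      /* initial edi */
    *--sp = 0;                      /* initial ebp, popped first */
    t->saved_esp = sp;
    return 0;
}
```

Because every thread has its own `saved_esp`, an interrupted call chain stays untouched on its own stack while other threads run, which avoids the corruption described in the first post.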
Re: Switching to Ring 0 task with iret
Yes yes, when I said process I think I meant task or, in general, any thread of execution that has its own context.
So how would that work exactly? I have one TSS (only 1 CPU) that holds one value of esp0, which gets loaded into esp every time an int gets called. Who's the owner of that stack? And how and when (and by whom) does it get updated/restored?
If you have any link that could explain this in detail, I'd appreciate it very much.
Thank you.
Cheers,
F