Switching to Ring 0 task with iret

thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Switching to Ring 0 task with iret

Post by thewrongchristian »

Lagor wrote:It took me a few answers from you kind people but I feel I'm getting it now.

Two more questions: is it mandatory to set up a kernel stack for each user process? I think I've heard that some OSes don't do this, but I might be wrong.
If I choose (or have) to have a kernel stack for every user process, does this mean that I need to keep an ESP0 value for each process in its TCB and then update the value in the only TSS that my system has accordingly? If this is the case, I have a few issues picturing the situation in my mind, but I'll deal with that later...

Cheers,
F
If you have a minimal microkernel, which just passes messages between user level processes, then you can probably have a single kernel stack, which just switches between user contexts. Basically, a send message system call would become:
  • Client calls send message system call - switch to kernel mode
  • Kernel, on kernel stack, copies message from client user context
  • Kernel, on same kernel stack, switches to the server context
  • Kernel, delivers message to the server, which has previously executed get message call to get the next message
  • Server returns from get message system call, does the work, then sends the message back, repeating the above steps.
The point being, all the processes can share the kernel context if the kernel context will not block, or blocks only when there are no messages to deliver (in which case there is no work to be done).
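
To make the steps above a bit more concrete, here is a minimal sketch in C of a non-blocking send on one shared kernel stack. Everything in it (`context_model`, `sys_send_model`, the fixed 32-byte message) is made up for illustration and is not taken from any real microkernel. "Switching context" is modelled as nothing more than changing which user context the kernel will return to.

```c
#include <assert.h>
#include <string.h>

#define MSG_SIZE 32  /* assumed fixed message size, for illustration only */

/* Hypothetical per-process user context; a real one would hold registers. */
struct context_model {
    char inbox[MSG_SIZE];      /* where a delivered message lands */
    int  waiting_for_message;  /* set while blocked in "get message" */
};

/* The user context that the single kernel stack will return to. */
struct context_model *current_context;

/*
 * Model of the send system call. It runs to completion on the one kernel
 * stack and never blocks, so no per-process kernel stack is needed.
 * Returns 0 on delivery, -1 if the server is not waiting for a message.
 */
int sys_send_model(struct context_model *client,
                   struct context_model *server,
                   const char *msg)
{
    assert(current_context == client); /* only the running context can call */

    if (!server->waiting_for_message) {
        return -1; /* a real kernel needs a policy here, e.g. queueing */
    }

    /* Copy the message from the client into the server. */
    strncpy(server->inbox, msg, MSG_SIZE - 1);
    server->inbox[MSG_SIZE - 1] = '\0';
    server->waiting_for_message = 0;

    /* "Switch to the server context": return to the server, not the client. */
    current_context = server;
    return 0;
}
```

The kernel never keeps any state of its own across the call, which is why one stack is enough in this model.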

You can also do lightweight concurrency on a single stack using something like protothreads, as used in the Contiki OS. But I think that's more aimed at memory-constrained systems where multiple kernel stacks are not practical.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Switching to Ring 0 task with iret

Post by thewrongchristian »

Lagor wrote: So how would that work exactly? I have one TSS (only 1 cpu) that holds one value of esp0 that gets loaded into esp every time an int gets called. Who's the owner of that stack?
Not every time an int is called. It is loaded only when an interrupt causes a privilege transition, such as from user mode (ring 3) to kernel mode (ring 0).

Otherwise, if you're already in kernel mode when the interrupt occurs, the value in tss.esp0 is ignored.
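
A rough sketch of that rule in C, in case it helps. This is only a model of what the CPU does when it delivers an interrupt to a ring 0 handler, not real kernel code. The names `tss_model`, `ring_of` and `interrupt_entry_esp` are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the TSS fields that matter here. */
struct tss_model {
    uint32_t esp0;  /* ring 0 stack pointer, used only on a ring change */
    uint16_t ss0;   /* ring 0 stack segment */
};

/* The low two bits of a code segment selector hold the privilege level. */
unsigned ring_of(uint16_t cs)
{
    return cs & 3u;
}

/*
 * Model of which stack pointer the CPU starts pushing the interrupt frame
 * on when the handler runs in ring 0.
 */
uint32_t interrupt_entry_esp(const struct tss_model *tss,
                             uint16_t interrupted_cs,
                             uint32_t interrupted_esp)
{
    assert(tss);

    if (ring_of(interrupted_cs) != 0) {
        /* Privilege transition: switch to the stack named by the TSS. */
        return tss->esp0;
    }
    /* Already in ring 0: tss->esp0 is ignored, the current stack is kept. */
    return interrupted_esp;
}
```

For example, with a user code selector of 0x1B (an assumed GDT layout) the result is `tss->esp0`, while with a kernel code selector of 0x08 the result is whatever ESP was at the time of the interrupt.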
Lagor wrote: And how and when (and by whom) it gets updated/restored?
It gets updated at the task switch. Here is my task switch (I use setjmp/longjmp to switch between tasks):

Code:

void arch_thread_switch(thread_t * thread)
{
        thread_t * old = arch_get_thread();

        if (old == thread) {
                /* Old and new thread are the same thread - no switch */
                thread->state = THREAD_RUNNING;
                return;
        }

        if (old->state == THREAD_RUNNING) {
                old->state = THREAD_RUNNABLE;
        }

        if (old->process != thread->process) {
                if (thread->process) {
                        /* switch user address space here */
                        vmap_set_asid(thread->process->as);
                } else {
                        vmap_set_asid(0);
                }
        }

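        if (0 == setjmp(old->context.state)) {
                /* setjmp returned 0: the old thread's context is now saved */
                if (thread->state == THREAD_RUNNABLE) {
                        thread->state = THREAD_RUNNING;
                }
                /* TSS.ESP0 is updated here: it is set to the top of the new
                 * thread's kernel stack (stack base plus one page) */
                tss[1] = (uint32_t)thread->context.stack + ARCH_PAGE_SIZE;
                current = thread;
                /* Resume the new thread: its own setjmp call returns 1 */
                longjmp(thread->context.state, 1);
        }
        /* Reached when this thread is later resumed by a longjmp */
}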
Lagor wrote: If you have any link that could explain this in detail, I'd appreciate it very much.
Thank you.
Actually, I can't find a reference that puts it succinctly but in sufficient detail. My code referenced above is a bit rough and ready, lacking sufficient comments to act as a reference.

I think the critical detail you're missing is when tss.esp0 is used, and that is when a privilege transition occurs. If you're already in kernel mode when an interrupt is processed, tss.esp0 is ignored.
Lagor
Posts: 23
Joined: Mon Nov 25, 2019 3:34 pm

Re: Switching to Ring 0 task with iret

Post by Lagor »

Awesome, thank you. I think I finally understood.
Last issue I have is:
Does anybody ever change the value of esp0 of a given thread?
If not, would it mean that the kernel context stack of the thread gets essentially reset at every privilege transition?
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Switching to Ring 0 task with iret

Post by thewrongchristian »

Lagor wrote:Awesome, thank you. I think I finally understood.
Last issue I have is:
Does anybody ever change the value of esp0 of a given thread?
I can't speak for anyone else, but I don't change the value used for tss.esp0 for the lifetime of the thread. When a thread is switched to, tss.esp0 is updated with that thread's kernel stack location, but that value is always the same for a particular thread, and I can't envisage a scenario that would require the value to change.
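
As a small illustration of why the value never needs to change, here is a sketch in C. The names (`thread_model`, `thread_esp0`, `load_esp0_for`) and the 4 KiB stack size are assumptions made for this example, not part of my kernel.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096u  /* assumed: one 4 KiB page per kernel stack */

/* Hypothetical thread record: only the kernel stack allocation matters. */
struct thread_model {
    uintptr_t kstack_base;  /* lowest address of this thread's kernel stack */
};

/*
 * x86 stacks grow downwards, so the initial ring 0 stack pointer is the
 * address just past the end of the allocation. It depends only on where
 * the stack was allocated, so it is fixed for the thread's lifetime.
 */
uintptr_t thread_esp0(const struct thread_model *t)
{
    assert(t->kstack_base != 0);  /* the stack must have been allocated */
    return t->kstack_base + KSTACK_SIZE;
}

/* What the task switch does: point the (single) TSS at the next thread. */
void load_esp0_for(uintptr_t *tss_esp0, const struct thread_model *next)
{
    *tss_esp0 = thread_esp0(next);
}
```

Switching from thread A to thread B and back to A writes the same value for A both times. Only the TSS field changes, never the per-thread value.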
Lagor wrote: If not, would it mean that the kernel context stack of the thread gets essentially reset at every privilege transition?
Yes.

It's not a problem that the kernel %esp is reset at every privilege transition, because if we're switching to ring 0 then, by definition, we were not previously in ring 0, and the kernel portion of the thread has no context.

We switch to ring 0 when:
  • The user process executes a system call or faults, so the user system call/fault provides the new kernel context with all it needs to service the system call/fault, along with the global kernel static and dynamic state (kernel variables etc.).
  • An interrupt comes in from hardware, in which case the hardware provides the kernel context with all it needs to service the interrupt, along with the global kernel static and dynamic state (kernel variables etc.).
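
To picture the "reset", here is a minimal model in C of what lands on the kernel stack on a ring 3 to ring 0 entry in 32-bit protected mode (the case without an error code). The struct and function names are invented for this sketch. The point is that the frame always starts immediately below esp0, so each entry from ring 3 simply overwrites whatever an earlier entry left there.

```c
#include <assert.h>
#include <stdint.h>

/*
 * What the CPU pushes on a ring 3 -> ring 0 interrupt, listed from the
 * lowest address (last pushed) to the highest (first pushed).
 */
struct ring3_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;  /* user stack pointer at the time of the interrupt */
    uint32_t ss;   /* user stack segment */
};

/*
 * Model of the entry: the frame is written just below esp0 and the
 * returned pointer is the kernel stack pointer the handler starts with.
 */
struct ring3_frame *enter_ring0_model(uint32_t *esp0,
                                      struct ring3_frame user_state)
{
    struct ring3_frame *frame;

    assert(esp0);
    frame = (struct ring3_frame *)esp0 - 1;
    *frame = user_state;
    return frame;
}
```

Calling this twice with the same esp0 returns the same frame address both times, which is all "reset at every privilege transition" means.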
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Switching to Ring 0 task with iret

Post by rdos »

Lagor wrote:Awesome, thank you. I think I finally understood.
Last issue I have is:
Does anybody ever change the value of esp0 of a given thread?
If not, would it mean that the kernel context stack of the thread gets essentially reset at every privilege transition?
The only scenario I can think of that would require this is if you have a DPMI server interface, which is no longer of much use. In that case, you need to save the state on the stack when doing interrupt reflections.
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Switching to Ring 0 task with iret

Post by nexos »

Lagor wrote:Two more questions: is it mandatory to set up a kernel stack for each user process?
Most OSes set up a kernel stack per thread, but it is theoretically possible to have one kernel stack per CPU. I believe Mach did that. In most cases, I would recommend having one kernel stack per thread.
Lagor
Posts: 23
Joined: Mon Nov 25, 2019 3:34 pm

Re: Switching to Ring 0 task with iret

Post by Lagor »

Thank you everyone.
I finally managed to wrap my head around these concepts and successfully implement kernel/user threads and context switches.

Cheers,
F
Lagor
Posts: 23
Joined: Mon Nov 25, 2019 3:34 pm

Re: Switching to Ring 0 task with iret

Post by Lagor »

Hi everyone,
I have another question regarding ring0 and ring3 threads.
Since I'd like to support both, is it correct to assume that depending on the type of thread selected to be switched-to next, I have to write different ways to switch task?

My current setup is as follows:
1) The timer scheduler gets called
2) A bunch of values (the state of the currently running "stuff") are pushed onto the stack by the CPU
3) An asm function switches to the next thread by saving the current thread context and loading another one

When the interrupted task gets to run again, it resumes just after its switch_to_task call (the one that gave the CPU to another context) and then returns from the interrupt by doing an IRET using the stuff previously pushed by the CPU (and left unchanged).

First question: does this seem reasonable?
Second: depending on the type of thread to be executed, whether I want it to run at ring0 or ring3, do I have to use RET/IRET accordingly?
What about the return from interrupt? I'm using IRET without touching the stuff pushed by the CPU. Do I need to take care of things differently there as well?

Basically I'm worried about having to take care of all the possible Protection Level transitions.
I have interrupt code running at ring0 that might switch to either ring0 code or ring3 code and so forth.
As usual, thank you for your time.

Cheers,
F
Octocontrabass
Member
Posts: 5567
Joined: Mon Mar 25, 2013 7:01 pm

Re: Switching to Ring 0 task with iret

Post by Octocontrabass »

Lagor wrote:Since I'd like to support both, is it correct to assume that depending on the type of thread selected to be switched-to next, I have to write different ways to switch task?
Nope! Task switching always happens in ring 0, so your scheduler doesn't need to know whether a task will stay in ring 0 or return to ring 3.
Lagor wrote:First question: does this seem reasonable?
No. A lot of tasks will yield before any timer interrupts happen, so you need a way to switch tasks that doesn't depend on interrupts. Once you have that, your timer interrupt can use it too.
Lagor wrote:Second: depending on the type of thread to be executed, whether I want it to run at ring0 or ring3, do I have to use RET/IRET accordingly?
What about the return from interrupt? I'm using IRET without touching the stuff pushed by the CPU. Do I need to take care of things differently there as well?
Task switching always happens in ring 0, so switch_to_task always uses RET. The task itself will use IRET (or SYSRET or SYSEXIT) if it needs to be in ring 3. As far as switch_to_task is concerned, there is no such thing as a new task: every task it switches to is an existing task that will be resumed. If you want to start a new task, you create the appropriate context so the new task will run in the correct ring when it resumes.
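
One way to picture "create the appropriate context" for a brand new ring 3 task is to fill in its ring 0 stack by hand so that it looks like the task was interrupted in ring 3 and then called switch_to_task. The C below is only a sketch under several assumptions: the selectors 0x1B and 0x23 assume user code and data at GDT entries 3 and 4, `iret_stub` is the address of a hypothetical assembly stub that ends in IRET, and switch_to_task is assumed to restore only the stack pointer before its RET. If yours also pops saved registers, matching dummy values have to be pushed as well.

```c
#include <assert.h>
#include <stdint.h>

#define USER_CS   0x1Bu  /* assumed selector: GDT entry 3, RPL 3 */
#define USER_SS   0x23u  /* assumed selector: GDT entry 4, RPL 3 */
#define EFLAGS_IF 0x200u /* interrupt enable flag */
#define EFLAGS_R1 0x002u /* reserved bit 1, always set */

/* Push one 32-bit value onto a downward-growing stack. */
uint32_t *push32(uint32_t *sp, uint32_t value)
{
    *--sp = value;
    return sp;
}

/*
 * Build the initial ring 0 stack of a new ring 3 task. The returned value
 * is the saved stack pointer to store in the task's control block.
 */
uint32_t *prepare_user_task(uint32_t *kstack_top, uint32_t entry,
                            uint32_t user_esp, uint32_t iret_stub)
{
    uint32_t *sp = kstack_top;

    assert(kstack_top);

    /* The frame IRET expects when returning to ring 3 (pushed in reverse). */
    sp = push32(sp, USER_SS);
    sp = push32(sp, user_esp);
    sp = push32(sp, EFLAGS_IF | EFLAGS_R1);
    sp = push32(sp, USER_CS);
    sp = push32(sp, entry);

    /* Return address consumed by the RET at the end of switch_to_task. */
    sp = push32(sp, iret_stub);
    return sp;
}
```

With a stack prepared like this, switch_to_task does not need to know the task is new: it switches stacks, its RET lands in the stub, and the stub's IRET drops to ring 3 at `entry`. A ring 0 task would simply have its kernel entry point as the return address instead.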

The only special handling you need to do in the timer interrupt is making sure you finish acknowledging it before you call switch_to_task, since there's no guarantee the next task will resume inside the timer interrupt handler.
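
A minimal sketch of that ordering in C. The functions `pic_send_eoi` and `switch_to_next_task` are hypothetical stand-ins that only record that they were called, so the order can be checked in an ordinary hosted program. In a real kernel they would talk to the interrupt controller and switch stacks.

```c
#include <assert.h>

enum event { EV_EOI = 1, EV_SWITCH = 2 };

#define MAX_EVENTS 8
int event_log[MAX_EVENTS];
int event_count;

/* Record one event so the call order can be inspected afterwards. */
void record_event(int ev)
{
    assert(event_count < MAX_EVENTS);
    event_log[event_count++] = ev;
}

/* Hypothetical stand-in for acknowledging the interrupt controller. */
void pic_send_eoi(void)
{
    record_event(EV_EOI);
}

/* Hypothetical stand-in for the scheduler picking and switching tasks. */
void switch_to_next_task(void)
{
    record_event(EV_SWITCH);
}

void timer_interrupt_handler(void)
{
    /*
     * Acknowledge first: the next task may resume somewhere other than
     * this handler, so nothing after the switch is guaranteed to run soon.
     */
    pic_send_eoi();
    switch_to_next_task();
    /* Code here runs only once this task is switched back to. */
}
```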
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Switching to Ring 0 task with iret

Post by rdos »

Octocontrabass wrote:
Lagor wrote:Since I'd like to support both, is it correct to assume that depending on the type of thread selected to be switched-to next, I have to write different ways to switch task?
Nope! Task switching always happens in ring 0, so your scheduler doesn't need to know whether a task will stay in ring 0 or return to ring 3.
Exactly, and the only difference between processes is which CR3 they use. That means the task switcher must update CR3 if the new task is in another process.
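
Roughly, in C, and only as a sketch: `process_model`, `task_model` and the `write_cr3` stub are invented for this example (a real `write_cr3` would be a `mov` to CR3), and every task is assumed to belong to some process.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical process record: the physical address of its page directory. */
struct process_model {
    uint32_t cr3;
};

/* Hypothetical task record: which process the task belongs to. */
struct task_model {
    struct process_model *process;
};

uint32_t fake_cr3;  /* stands in for the real CR3 register */
int cr3_writes;     /* counts reloads, each of which flushes the TLB */

/* Stub: a real kernel would execute "mov %eax, %cr3" here. */
void write_cr3(uint32_t value)
{
    fake_cr3 = value;
    cr3_writes++;
}

/* Reload CR3 only when the next task lives in a different process. */
void switch_address_space(const struct task_model *old,
                          const struct task_model *next)
{
    assert(next->process);  /* assumption: no task without a process */

    if (old->process != next->process) {
        write_cr3(next->process->cr3);
    }
}
```

Skipping the reload between two threads of the same process avoids a needless TLB flush.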
Lagor
Posts: 23
Joined: Mon Nov 25, 2019 3:34 pm

Re: Switching to Ring 0 task with iret

Post by Lagor »

Octocontrabass wrote:
Lagor wrote:Since I'd like to support both, is it correct to assume that depending on the type of thread selected to be switched-to next, I have to write different ways to switch task?
Nope! Task switching always happens in ring 0, so your scheduler doesn't need to know whether a task will stay in ring 0 or return to ring 3.
Ok, I can live with that.
Octocontrabass wrote:
Lagor wrote:First question: does this seem reasonable?
No. A lot of tasks will yield before any timer interrupts happen, so you need a way to switch tasks that doesn't depend on interrupts. Once you have that, your timer interrupt can use it too.
Yes, this I understand. For the moment I was just trying to get the timer interrupt logic right.
Octocontrabass wrote:
Lagor wrote:Second: depending on the type of thread to be executed, whether I want it to run at ring0 or ring3, do I have to use RET/IRET accordingly?
What about the return from interrupt? I'm using IRET without touching the stuff pushed by the CPU. Do I need to take care of things differently there as well?
Task switching always happens in ring 0, so switch_to_task always uses RET. The task itself will use IRET (or SYSRET or SYSEXIT) if it needs to be in ring 3.
This added a lot of confusion to me.
If I switch to a task with a RET while I'm in the interrupt handler, the flow will be directed to the new task code and, in a way, the interrupt handler code won't return (yet). Is this right?
So let's say my task was interrupted before a printf instruction. When I resume it from interrupt code running at ring0, I need to set up the stack, the EIP (the printf instruction) and CS so that when the RET executes, we're running the task in ring3 with the appropriate stack and at the right instruction.
I absolutely don't understand what you mean by 'the task itself need to put itself into the correct privilege level'
Octocontrabass wrote:As far as switch_to_task is concerned, there is no such thing as a new task: every task it switches to is an existing task that will be resumed. If you want to start a new task, you create the appropriate context so the new task will run in the correct ring when it resumes.

The only special handling you need to do in the timer interrupt is making sure you finish acknowledging it before you call switch_to_task, since there's no guarantee the next task will resume inside the timer interrupt handler.

Yes, this I do, but as the last instruction of the timer interrupt handler. If I switched task before that, does this mean that piece of code will not get called?
I have to say, I thought I was almost there: I have perfectly running ring0 tasks, but as soon as I wanted to mix them up with some ring3 ones, things stopped working.
I have to say, i thought i was almost there, I have perfectly running ring0 tasks but as soon as i wanted to mix them up with some ring3 ones, things started not to work.
Cheers,
F
Octocontrabass
Member
Posts: 5567
Joined: Mon Mar 25, 2013 7:01 pm

Re: Switching to Ring 0 task with iret

Post by Octocontrabass »

Lagor wrote:This added a lot of confusion to me.
If I switch to a task with a RET while I'm in the interrupt handler, the flow will be directed to the new task code and, in a way, the interrupt handler code won't return (yet). Is this right?
That's right. You have to make sure your interrupt handler is done acknowledging the hardware before you switch tasks.
Lagor wrote:So let's say my task was interrupted before a printf instruction. When I resume it from interrupt code running at ring0, I need to set up the stack, the EIP (the printf instruction) and CS so that when the RET executes, we're running the task in ring3 with the appropriate stack and at the right instruction.
I absolutely don't understand what you mean by 'the task itself need to put itself into the correct privilege level'
1) A task running in ring 3 is interrupted right before a printf call.
2) The interrupt handler, running in ring 0, calls the task switch function.
3) The task switch function switches to a different ring 0 stack, and returns to whatever is on top of that stack.
4) Eventually, the task switch function gets called again, switches back to this task's ring 0 stack, and returns to the interrupt handler.
5) The interrupt handler returns to the printf call in ring 3.

You don't need to set anything up because it's already there on the ring 0 stack. Each task has its own stack and you're just switching between them.
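
If it helps, here is a toy model in C of "just switching between stacks". It does not switch real stacks. `cpu_esp` stands in for the ESP register, and `tcb_model` and `switch_to_task_model` are names invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical task control block: only the saved ring 0 stack pointer. */
struct tcb_model {
    uintptr_t saved_esp;
};

uintptr_t cpu_esp;               /* stands in for the real ESP register */
struct tcb_model *current_task;  /* task whose ring 0 stack is in use */

void switch_to_task_model(struct tcb_model *next)
{
    assert(current_task && next);

    current_task->saved_esp = cpu_esp;  /* remember where this stack stopped */
    cpu_esp = next->saved_esp;          /* continue on the other task's stack */
    current_task = next;
    /*
     * The real function ends with RET here, which pops a return address
     * from the stack that was just switched to.
     */
}
```

When task A is switched away from and later switched back to, ESP comes back exactly as it was, so the interrupt frame and return addresses that were on A's ring 0 stack are still there untouched.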
Lagor wrote:Yes, this I do, but as the last instruction of the timer interrupt handler. If I switched task before that, does this mean that piece of code will not get called?
It will get called when the task resumes. If the next task switch won't happen until after that code runs, then it'll never resume!