Task switching difficulties between user and kernel
Hi.
I'm experimenting with switches from user mode to kernel mode and back. There are some pitfalls I would like to ask about, and hopefully get some answers.
The processor is in protected mode.
1. It appears that setting up a valid TSS is necessary for the switch. If I simply leave the TSS inactive, any attempt to change from user mode to kernel mode gives me a #GP.
So you literally need such a TSS in the first place to be able to switch between privilege rings. On the other hand, there are many explanations that a lot of modern operating systems don't bother using hardware context switches, because their performance isn't even comparable to that of software context switches.
If you need to set up a TSS, neglect many of its fields, and only take care of ESP0, SS0, and EIP, then you are touching the hardware context switch, aren't you?
2. It is advised to have separate user mode and kernel mode stacks for each task. When you switch from user mode to kernel mode, you only switch from the user mode task stack to the kernel mode task stack. If an interrupt or an exception happens, you then go from the kernel mode task stack to your main kernel stack. Is that how it works?
What if I simply didn't have a kernel mode task stack? Once an interrupt preempts the kernel while it is handling my task, the IRQ handler and the task handler share the same stack, so everything would be safe, true?
Best regards.
Iman.
Re: Task switching difficulties between user and kernel
iman wrote: If you need to set up a TSS, neglect many of its fields, and only take care of ESP0, SS0, and EIP, then you are touching the hardware context switch, aren't you?
It's not a hardware task switch if you're not using a task gate.
The Intel manual, volume 3A section 7.3, lists the four situations where a hardware task switch occurs.
iman wrote: If an interrupt or an exception happens, you then go from the kernel mode task stack to your main kernel stack. Is that how it works?
That depends on how you set up the IDT for your interrupt handlers. If you use interrupt gates, they don't switch to a different stack when the CPU is already running in kernel mode.
iman wrote: What if I simply didn't have a kernel mode task stack?
It's possible to use only one stack in the kernel, but I think it makes task switching more difficult. (It's one stack per CPU once you support multiple CPUs.)
iman wrote: Once an interrupt preempts the kernel while it is handling my task, the IRQ handler and the task handler share the same stack, so everything would be safe, true?
As long as the stack has enough space, yes.
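To make the gate distinction concrete, here is a rough sketch of how 32-bit IDT entries are encoded. The struct and helper names (`idt_entry`, `make_interrupt_gate`, `make_task_gate`, `gate_causes_task_switch`) are made up for illustration, and the packed attribute assumes GCC or Clang. Only the type field decides whether delivering the interrupt involves a hardware task switch.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte IDT entry in 32-bit protected mode. */
struct idt_entry {
    uint16_t offset_low;   /* handler address bits 0..15 (unused by task gates) */
    uint16_t selector;     /* code segment selector, or TSS selector for a task gate */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* P (bit 7), DPL (bits 5..6), gate type (bits 0..3) */
    uint16_t offset_high;  /* handler address bits 16..31 (unused by task gates) */
} __attribute__((packed));

static_assert(sizeof(struct idt_entry) == 8, "an IDT entry is 8 bytes");

enum {
    GATE_TASK      = 0x5,  /* hardware task switch through a TSS */
    GATE_INTERRUPT = 0xE,  /* 32-bit interrupt gate, clears IF on entry */
    GATE_TRAP      = 0xF,  /* 32-bit trap gate, leaves IF alone */
    GATE_PRESENT   = 0x80
};

/* Build an interrupt gate that jumps to `handler` in segment `code_sel`. */
static struct idt_entry make_interrupt_gate(uint32_t handler, uint16_t code_sel,
                                            unsigned dpl)
{
    struct idt_entry e;
    e.offset_low  = (uint16_t)(handler & 0xFFFFu);
    e.selector    = code_sel;
    e.reserved    = 0;
    e.type_attr   = (uint8_t)(GATE_PRESENT | ((dpl & 3u) << 5) | GATE_INTERRUPT);
    e.offset_high = (uint16_t)(handler >> 16);
    return e;
}

/* Build a task gate that refers to the TSS selected by `tss_sel`. */
static struct idt_entry make_task_gate(uint16_t tss_sel)
{
    struct idt_entry e;
    e.offset_low  = 0;
    e.selector    = tss_sel;
    e.reserved    = 0;
    e.type_attr   = (uint8_t)(GATE_PRESENT | GATE_TASK);
    e.offset_high = 0;
    return e;
}

/* Only a task gate makes the CPU perform a hardware task switch. */
static int gate_causes_task_switch(const struct idt_entry *e)
{
    return (e->type_attr & 0x0F) == GATE_TASK;
}
```

With an assumed kernel code selector of 0x08, `make_interrupt_gate(handler, 0x08, 0)` yields the familiar attribute byte 0x8E, while a task gate comes out as 0x85.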
Re: Task switching difficulties between user and kernel
iman wrote: If you need to set up a TSS, neglect many of its fields, and only take care of ESP0, SS0, and EIP, then you are touching the hardware context switch, aren't you?
You just need SS0 and ESP0. EIP will come from the interrupt/trap gate.
More than likely, you'll have a single SS shared for all tasks (same as DS, ES etc.), so it'll only be ESP0 that will change when you switch tasks in software. As part of your software task switch, you'll just set the TSS.ESP0 for the current processor (each processor will need its own TSS.)
You're just using the TSS to hold the kernel stack pointer for the switch to supervisor mode. All the other aspects of the TSS, including the hardware task switching, can be ignored.
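As a rough illustration of that usage, here is a sketch of the 32-bit TSS layout with only SS0/ESP0 maintained. The field order follows the architectural layout, but the names `tss32`, `cpu_tss`, `tss_init`, `tss_set_kernel_stack` and the `MAX_CPUS` value are assumptions made for the example, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit TSS; everything except ss0/esp0/iomap_base stays zero here. */
struct tss32 {
    uint16_t link, reserved0;
    uint32_t esp0;                 /* kernel stack pointer loaded on ring 3 -> 0 */
    uint16_t ss0, reserved1;       /* kernel stack segment loaded on ring 3 -> 0 */
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt, reserved10;
    uint16_t trap;
    uint16_t iomap_base;           /* offset of the I/O permission bitmap */
} __attribute__((packed));

static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");
static_assert(offsetof(struct tss32, esp0) == 4, "ESP0 lives at offset 4");

#define MAX_CPUS 4                 /* assumed: BSP plus three APs */

static struct tss32 cpu_tss[MAX_CPUS];   /* one TSS per processor */

/* One-time setup per CPU: shared kernel SS, no I/O bitmap. */
static void tss_init(struct tss32 *tss, uint16_t kernel_ss)
{
    memset(tss, 0, sizeof *tss);
    tss->ss0 = kernel_ss;
    /* A bitmap offset at or beyond the TSS limit means "no bitmap". */
    tss->iomap_base = sizeof *tss;
}

/* Called from the software task switch: point ESP0 at the next task's
 * kernel stack so the next ring 3 -> ring 0 entry lands there. */
static void tss_set_kernel_stack(unsigned cpu, uint32_t kstack_top)
{
    cpu_tss[cpu].esp0 = kstack_top;
}
```

In a real kernel each TSS also needs its own descriptor in the GDT and an `ltr` on the owning CPU; that part is left out of the sketch.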
iman wrote: 2. It is advised to have separate user mode and kernel mode stacks for each task. When you switch from user mode to kernel mode, you only switch from the user mode task stack to the kernel mode task stack. If an interrupt or an exception happens, you then go from the kernel mode task stack to your main kernel stack. Is that how it works? What if I simply didn't have a kernel mode task stack? Once an interrupt preempts the kernel while it is handling my task, the IRQ handler and the task handler share the same stack, so everything would be safe, true?
If you get an interrupt while already in kernel mode, you won't switch stacks. You'll just build on the existing kernel stack, do the interrupt handling, and iret back to where you were before in kernel mode. The TSS is only consulted for a new stack for *changes* in privilege level.
Yes, you can switch stacks if you configure the interrupt as a task gate, but that would require that hardware task switching you want to avoid. This might be useful, for example, to handle faults that would relate to stack overflow events (in kernel mode, if you've overflowed the stack, how can you handle faults?)
The only other use for the TSS is the I/O port permission bitmap. I don't use this myself, but it's probably not too onerous to maintain this in software for processes that need it.
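For anyone who does want the bitmap, here is a small sketch of how it is interpreted: one bit per port, a clear bit grants access, and every port touched by a multi-byte access must be granted. The names `io_bitmap`, `iopb_deny_all`, `iopb_allow` and `iopb_permits` are invented for the example, and the functions only model the check in software.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IOPB_BYTES 8192            /* 65536 ports, one bit each */

struct io_bitmap {
    /* The extra final byte is the all-ones terminator the CPU expects. */
    uint8_t bits[IOPB_BYTES + 1];
};

static_assert(sizeof(struct io_bitmap) == 8193, "bitmap plus terminator byte");

/* Set every bit: all ports denied to code running with CPL > IOPL. */
static void iopb_deny_all(struct io_bitmap *b)
{
    memset(b->bits, 0xFF, sizeof b->bits);
}

/* Clear the bit for one port to grant access to it. */
static void iopb_allow(struct io_bitmap *b, uint16_t port)
{
    b->bits[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* Software model of the check: an access of `width` bytes at `port` is
 * permitted only if the bits for all covered ports are clear. */
static int iopb_permits(const struct io_bitmap *b, uint16_t port, unsigned width)
{
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        if (p > 0xFFFFu)
            return 0;              /* runs into the terminator byte: denied */
        if (b->bits[p / 8] & (1u << (p % 8)))
            return 0;
    }
    return 1;
}
```

Granting only port 0x3F8 lets a one-byte access through but still denies a two-byte access at the same address, because port 0x3F9 is not granted.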
Re: Task switching difficulties between user and kernel
Octocontrabass wrote: The Intel manual, volume 3A section 7.3, lists the four situations where a hardware task switch occurs.
Now I see the border between hardware and software task switch.
Octocontrabass wrote: That depends on how you set up the IDT for your interrupt handlers. If you use interrupt gates, they don't switch to a different stack.
They are interrupt gates.
Last edited by iman on Tue Sep 08, 2020 5:34 am, edited 1 time in total.
Re: Task switching difficulties between user and kernel
thewrongchristian wrote: As part of your software task switch, you'll just set the TSS.ESP0 for the current processor (each processor will need its own TSS.)
So if I have, say, three APs, I must set up four separate TSSs (one for the BSP and one per AP). With more processor cores, correspondingly more TSSs are required.
thewrongchristian wrote: The TSS is only consulted for a new stack for *changes* in privilege level.
Out of curiosity: does that mean that if I have a user mode task which owns the same stack as the main kernel, the CPU safely switches by checking TSS.ESP0?
Re: Task switching difficulties between user and kernel
thewrongchristian wrote: The TSS is only consulted for a new stack for *changes* in privilege level.
iman wrote: Out of curiosity: does that mean that if I have a user mode task which owns the same stack as the main kernel, the CPU safely switches by checking TSS.ESP0?
If you mean the kernel stack corresponding to the user mode process, what do you mean by "main kernel"? Do you envisage a separate "main kernel" thread? Then no, they can't share stacks, especially if you have multiple CPUs which may be executing them concurrently.
If you mean can user processes somehow share a single kernel stack (per CPU), then yes. In fact, I believe some minimal microkernels do just that (you can check that in L4, for example), if all the kernel is doing is passing messages between user processes, or doing some privileged operation on behalf of a user process. Then, a context switch is just a case of pointing to the corresponding user context to restore upon exit from a syscall/interrupt, in which case you would only have to update the TSS to change the I/O permission bitmap for the next process, if required.
But it'd mean your kernel cannot sleep other than to idle waiting for the next interrupt.
Re: Task switching difficulties between user and kernel
thewrongchristian wrote: If you mean can user processes somehow share a single kernel stack (per CPU), ...
Yes, that was what I had in mind as an abstract example.
Re: Task switching difficulties between user and kernel
thewrongchristian wrote: If you get an interrupt while already in kernel mode, you won't switch stacks. You'll just build on the existing kernel stack, do the interrupt handling, and iret back to where you were before in kernel mode.
I'm rather curious exactly how the CPU handles this. The Intel programming guide only says:
"If a stack switch occurred when calling the handler procedure, the IRET instruction switches back to the interrupted procedure’s stack on the return."
However, it doesn't specify HOW it decides whether or not to restore the stack.
Given that you could have the following:
- user mode process generates a fault (stack switch to pl0 stack)
- exception handler experiences a fault (say, a page fault) (no stack switch)
- page fault handler experiences another fault (also no stack switch)
(note that this sequence is NOT a double fault)
How does the IRET know whether to restore the stack or not? I assume it performs a privilege check of the saved CS (that's probably why CS/EIP is always on top) but if you were using the same segment for user and system code, CS will be the same. Is it tracking the exception "depth" so that the "oldest" return gets a stack switch?
Re: Task switching difficulties between user and kernel
sj95126 wrote: However, it doesn't specify HOW it decides whether or not to restore the stack.
It goes by the available data. What data is available to tell it whether or not a stack switch has happened? The interrupt frame only consists of CS, EIP, and EFLAGS. Which of these could identify whether a stack switch took place?
CS of course. A stack switch only (and always) happens when escalating to a higher level of privilege, so if a stack switch happened, the CS in the interrupt frame will belong to a lower privilege level. So when IRET sees that the new CS is of lower privilege, it knows to also look for stack information.
sj95126 wrote: Given that you could have the following:
- user mode process generates a fault (stack switch to pl0 stack)
- exception handler experiences a fault (say, a page fault) (no stack switch)
- page fault handler experiences another fault (also no stack switch)
(note that this sequence is NOT a double fault)
How does the IRET know whether to restore the stack or not?
In that case, the outermost stack frame will have the user's CS and the inner ones will have the kernel's CS.
sj95126 wrote: I assume it performs a privilege check of the saved CS (that's probably why CS/EIP is always on top) but if you were using the same segment for user and system code, CS will be the same.
That is impossible. User and kernel CS must differ in the DPL so the RPL can be adjusted accordingly. If the kernel's CS is loaded, it must be loaded with RPL 0, and a successfully loaded CS RPL is the CPL. If you run user code with CPL 0, you just gave up all hardware protection. Also, if your user code does run at CPL 0, then even the outermost exception you showed here causes no stack switch.
Honestly, I don't remember how exactly DPL, RPL, and CPL work. All I do know is that you must have a user CS with a DPL of 3 and load it with an RPL of 3, in order to get both a CPL of 3 and no GPF. And since the DPL must be different, so must be the CS.
Carpe diem!
Re: Task switching difficulties between user and kernel
sj95126 wrote: However, it doesn't specify HOW it decides whether or not to restore the stack.
nullplan wrote: It goes by the available data.
I figured that was the case, but you have to admit it's very unlike Intel not to specify that in nauseating detail. The 10-volume combined programmer's guide is over 5,000 pages. Sometimes they repeat the same conceptual sequence dozens of times.
Especially considering a decision based on CPL vs. RPL vs. DPL, where they usually go into an almost dizzying decision tree, I was very surprised they don't outline how this decision is made. "Isn't it obvious?" is not their usual way of things.
Re: Task switching difficulties between user and kernel
Indeed it is Intel's occasional tendency to witter on a bit that made me lose all memory of how protected mode rings work in detail. However, if you want nauseating detail, look at the description of IRET, which answers your question in the pseudo-code section: https://www.felixcloutier.com/x86/iret:iretd
So it was the RPL that makes it switch stacks!
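Spelled out as a toy model, the check from the IRET pseudo-code looks roughly like this: the return pops SS:ESP only when the RPL of the saved CS is numerically greater than the current CPL. The function names are made up, and the selector values 0x08 (kernel code) and 0x1B (user code, RPL 3) are just commonly used examples, not anything required by the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* The low two bits of a segment selector are its RPL. */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

/* Model of the protected-mode IRET decision: returning to an outer (less
 * privileged) level restores SS:ESP from the frame, otherwise the return
 * stays on the current stack. */
static int iret_restores_stack(uint16_t saved_cs, unsigned cpl)
{
    assert(cpl <= 3);
    return selector_rpl(saved_cs) > cpl;
}

/* Walk a nest of interrupt frames, innermost first, all handled at `cpl`,
 * and count how many of the returns switch stacks. */
static unsigned count_stack_restores(const uint16_t *saved_cs, unsigned n,
                                     unsigned cpl)
{
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++)
        count += (unsigned)iret_restores_stack(saved_cs[i], cpl);
    return count;
}
```

For the nested case discussed earlier (user fault, then two faults inside kernel handlers), the saved CS values from innermost to outermost would be 0x08, 0x08, 0x1B, so only the last IRET restores the user stack.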
Carpe diem!