Re: How do modern OSs handle the CPU regs of multiple proces
Posted: Tue Aug 09, 2022 2:28 am
Thanks for sharing the interesting detail. So the handlers (or part of them) have to be identity mapped because they run across the MMU on/off boundary? I wonder whether this is because the architecture dates from a period when most CPUs didn't have MMUs? The newer Power architecture (like POWER10) probably doesn't have the same limitation anymore?
xeyes wrote: I guess people tend to say return because of the flexibility of IRET (and to a lesser degree, far return) on x86.
nullplan wrote: It is the same on many architectures. My beloved PowerPC has "rfi" (return from interrupt), and that instruction is used twice for each syscall and interrupt. Because someone decided that all exceptional conditions ought to turn off the MMU, so the first-level exception handlers have to find a kernel stack to write their info to and then use "rfi" to turn on the MMU again and transition to the kernel handler in the same instruction. And then afterwards it uses "rfi" to return to the user space code.
However, this is a detail. I tend to see execution on a CPU as something in which userspace resides on the topmost stack frame. Any interrupt or syscall transitions to kernel space and pushes the outermost stack frame onto the kernel stack (on PowerPC this just happens in software while the MMU is off, but that is incidental), and finally the CPU will return back to userspace. Scheduling means switching stacks, and the low-level switch function does nothing but save the non-volatile registers before doing so. On thread exit the same thing happens, except the task is marked with a flag that means it will never be scheduled in again. And on startup, initialization does partly entail constructing an initial task stack. The funny thing is, you never need to clean up the stack entirely when running the userspace task for the first time: Just construct an IRET frame and perform an IRET (or maybe just SYSRET), and next time the kernel is called, the stack starts over at the place named in the TSS.
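For anyone following along, the "construct an IRET frame" step could be sketched roughly like this in C for x86-64. This is only an illustration under assumptions: struct task, task_init_user, the stack size, and the USER_CS/USER_SS selector values are all hypothetical names and numbers (real selectors depend on your GDT layout). It only fills in the five words that IRETQ pops and records the stack top that would go into the TSS; the actual IRETQ and the TSS write are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes and selectors; real values depend on the kernel's GDT. */
#define KSTACK_WORDS 512          /* 4 KiB kernel stack, in 64-bit words */
#define USER_CS      0x1bu        /* assumed user code selector, RPL 3 */
#define USER_SS      0x23u        /* assumed user data selector, RPL 3 */
#define RFLAGS_INIT  0x202u       /* reserved bit 1 set, IF (bit 9) set */

/* Order in which IRETQ pops its frame, lowest address first. */
enum { FRAME_RIP, FRAME_CS, FRAME_RFLAGS, FRAME_RSP, FRAME_SS, FRAME_WORDS };

struct task {
    uint64_t  kstack[KSTACK_WORDS];
    uint64_t *saved_sp;  /* where a first "return to user" would start popping */
    uint64_t *rsp0;      /* stack top the TSS would name for later kernel entries */
};

/* Build the initial frame at the very top of the task's kernel stack. */
static uint64_t *task_init_user(struct task *t, uint64_t entry, uint64_t user_sp)
{
    assert(t);
    uint64_t *frame = &t->kstack[KSTACK_WORDS - FRAME_WORDS];

    frame[FRAME_RIP]    = entry;        /* first user instruction */
    frame[FRAME_CS]     = USER_CS;
    frame[FRAME_RFLAGS] = RFLAGS_INIT;  /* interrupts enabled in user mode */
    frame[FRAME_RSP]    = user_sp;      /* initial user stack pointer */
    frame[FRAME_SS]     = USER_SS;

    t->saved_sp = frame;
    /* After IRETQ consumes the frame, the kernel stack is empty again, so
       the next interrupt or syscall can simply start from the top. */
    t->rsp0 = &t->kstack[KSTACK_WORDS];
    return frame;
}
```

In this sketch nothing has to be unwound after the first return to user space: once the frame is popped, the next entry into the kernel starts again from rsp0 and overwrites whatever was left below it.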
Why do you need to clean it up ever? Is it because of using some fast syscall instructions that don't switch stacks during a syscall? Otherwise, if there's always a stack switch, the user-level code won't be able to look at it, so there's no need to clean up, right?
nullplan wrote: The funny thing is, you never need to clean up the stack entirely when running the userspace task for the first time