Confused about context switch

nullplan · Post by **nullplan** » Sun Oct 15, 2023 11:55 am

KrotovOSdev wrote:So I can't write interrupt handlers in pure C? Then I'll use assembly.

Perhaps I should clarify that I mean the interrupt entry and exit code only. The actual handler you can write in C. I use Linux's naming convention where I have an assembler label, e.g. "divide_error", that handles exception entry for IDT entry 0, and then a C function do_divide_error(), that actually handles it. The assembler only saves all registers, does whatever else may be needed, calls the C function, restores the registers, does whatever may be needed on interrupt exit (e.g. delivering signals) and irets.
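
A minimal sketch of the C side of that split, assuming a 32-bit entry stub that runs `pushad` and then passes a pointer to the saved registers. The `struct int_frame` name, its field names, and `last_fault_eip` are illustrative only and are not taken from Linux or from anyone's kernel in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stack layout at the point the stub calls into C, lowest address
 * first. PUSHAD stores EAX first and EDI last, so EDI ends up lowest. */
struct int_frame {
    uint32_t edi, esi, ebp, esp_ignored, ebx, edx, ecx, eax; /* PUSHAD */
    uint32_t eip, cs, eflags;   /* pushed by the CPU (same-privilege case) */
};

/* Hypothetical bookkeeping so the handler has something observable to do. */
static uint32_t last_fault_eip;

/* C half of the "divide_error" entry: called by the assembly stub with a
 * pointer to the saved frame. Exception 0 pushes no error code. */
void do_divide_error(struct int_frame *frame)
{
    last_fault_eip = frame->eip;
}
```

The stub stays responsible for everything around the call: saving the registers, passing the frame pointer, restoring, and `iret`.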

I just saw something in the CPU manual I had overlooked before: In 64-bit mode, the CPU will align the stack pointer in hardware to a 16-byte boundary when starting the interrupt process. So I don't need to do that myself.
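
To make the arithmetic concrete, here is a small model of what happens to RSP on interrupt delivery in 64-bit mode. It assumes no stack switch (no IST entry, no privilege change), and the function name is made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models RSP on interrupt delivery in 64-bit mode: align down to 16 bytes,
 * then push SS, RSP, RFLAGS, CS, RIP (8 bytes each) and, for exceptions
 * that have one, an error code. */
uint64_t rsp_after_delivery(uint64_t old_rsp, bool has_error_code)
{
    uint64_t rsp = old_rsp & ~UINT64_C(0xF);
    rsp -= 5 * 8;
    if (has_error_code)
        rsp -= 8;
    return rsp;
}
```

Note that the alignment is applied before the frame is pushed, so the value the stub sees at entry is a known offset from a 16-byte boundary: 8 bytes off without an error code, exactly aligned with one.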

KrotovOSdev wrote:Do I have to check if esp is larger than my stack limit or what?

Typically, if you implement nested interrupts, you check if the stack pointer is more than some limit away from the stack bottom before enabling interrupts again. This gets more complicated if you have more stacks with different sizes, but basically, that is it. That of course requires you to be able to find the current stack limit.
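
A sketch of that check, assuming a downward-growing stack where "bottom" means the lowest usable address. The function name and the idea of passing the margin as a parameter are choices made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The stack grows down, so `stack_bottom` is the lowest usable address.
 * Returns true when at least `margin` bytes remain below `sp`. */
bool stack_has_room(uintptr_t sp, uintptr_t stack_bottom, size_t margin)
{
    return sp >= stack_bottom && sp - stack_bottom >= margin;
}
```

A handler would call this with the current stack pointer and only re-enable interrupts when it returns true. With several stacks, the only extra work is looking up the right `stack_bottom` for the stack currently in use.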

rdos · Post by **rdos** » Mon Oct 16, 2023 1:19 am

Octocontrabass wrote:
KrotovOSdev wrote:
rdos wrote:You should also save ALL registers in your TCB, not just some. Register context should not be saved on the (interrupt) stack as this is begging for problems later when voluntary context switches are added.
Aren't all context switches voluntary? Why would saving registers on the stack be a problem? As long as you don't allow context switches inside nested interrupt handlers, you won't have to worry about accessing the nested context.

At least my synchronization primitives are all based on kernel code blocking, and therefore, there is no interrupt context on the stack that can be used. Rather, the blocking code expects to return to a point in kernel space, not in user space. I support kernel threads, and kernel can be preempted, so my scheduler needs to handle task switches & blockings in kernel space, and when the actions originate from user space, that's just a special case that doesn't need specific handling.

Also, by saving registers on the kernel stack, you will consume more stack space, particularly with nested interrupts. It's also more convenient for debugging if task state is saved in the TCB.

rdos · Post by **rdos** » Mon Oct 16, 2023 1:27 am

nullplan wrote:
KrotovOSdev wrote:Do I have to check if esp is larger than my stack limit or what?
Typically, if you implement nested interrupts, you check if the stack pointer is more than some limit away from the stack bottom before enabling interrupts again. This gets more complicated if you have more stacks with different sizes, but basically, that is it. That of course requires you to be able to find the current stack limit.

Hmm, maybe I should use that. I have some rare cases when the network gets overloaded, which causes the kernel stack to overflow and the kernel to panic. I'm not sure if it is the network chip that is not clearing the interrupt causes, and thus gets into an interrupt loop, or if it is actually because the buffer gets full. Anyway, the RTL8169-compatible chips respond to overload by stopping the controller, something that is quite problematic when it needs to be restarted on an overloaded network. Intel's i2-chips instead drop packets, which is a lot better.

Octocontrabass · Post by **Octocontrabass** » Mon Oct 16, 2023 11:30 am

rdos wrote:At least my synchronization primitives are all based on kernel code blocking, and therefore, there is no interrupt context on the stack that can be used. Rather, the blocking code expects to return to a point in kernel space, not in user space.

All context switches involve a return to kernel space. The interrupt context is only necessary for tasks that have been interrupted.

rdos wrote:Also, by saving registers on the kernel stack, you will consume more stack space, particularly with nested interrupts.

It's definitely a problem with nested interrupts. However, if you don't allow nested interrupts, it's a simple tradeoff between stack size and TCB size.

rdos wrote:It's also more convenient for debugging if task state is saved in the TCB.

Isn't debugging usually done through interrupts? It doesn't make much difference where the registers are saved in that case, either way they'll all be in one place.

rdos wrote:I have some rare cases when the network gets overloaded, which causes the kernel stack to overflow and the kernel to panic.

Linux had the same issue and solved it by disallowing nested interrupts. Interrupt handlers that want to run interruptible tasks have to schedule those tasks to run elsewhere.
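
The general shape of that deferral can be sketched as a small queue: the interrupt handler only records the work, and something else runs it later. This is an illustration of the idea, not Linux's actual softirq, tasklet, or workqueue API; every name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFERRED_MAX 16  /* arbitrary capacity for the sketch */

typedef void (*deferred_fn)(void *arg);

struct deferred_item { deferred_fn fn; void *arg; };

static struct deferred_item deferred_queue[DEFERRED_MAX];
static size_t head, tail;   /* head: next to run, tail: next free slot */

/* Called from the interrupt handler, with interrupts still disabled. */
bool defer_work(deferred_fn fn, void *arg)
{
    size_t next = (tail + 1) % DEFERRED_MAX;
    if (next == head)
        return false;       /* queue full: caller must drop or count it */
    deferred_queue[tail].fn = fn;
    deferred_queue[tail].arg = arg;
    tail = next;
    return true;
}

/* Called later, outside the handler. A real kernel must protect head/tail
 * against a concurrent defer_work(), which this sketch does not do. */
size_t run_deferred(void)
{
    size_t ran = 0;
    while (head != tail) {
        struct deferred_item item = deferred_queue[head];
        head = (head + 1) % DEFERRED_MAX;
        item.fn(item.arg);
        ran++;
    }
    return ran;
}

/* Example work item: bumps a counter passed through `arg`. */
static void count_packet(void *arg)
{
    ++*(int *)arg;
}
```

Because the handler itself never re-enables interrupts, its stack usage is bounded no matter how fast the device raises them; an overload shows up as a full queue instead of a stack overflow.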

rdos · Post by **rdos** » Mon Oct 16, 2023 1:05 pm

Octocontrabass wrote:
rdos wrote:At least my synchronization primitives are all based on kernel code blocking, and therefore, there is no interrupt context on the stack that can be used. Rather, the blocking code expects to return to a point in kernel space, not in user space.
All context switches involve a return to kernel space. The interrupt context is only necessary for tasks that have been interrupted.

Exactly, so why build it in other cases?

Octocontrabass wrote:
rdos wrote:It's also more convenient for debugging if task state is saved in the TCB.
Isn't debugging usually done through interrupts? It doesn't make much difference where the registers are saved in that case, either way they'll all be in one place.

I unwind the stack for exceptions so the TCB points to code that should be executed next. So, if user-space is interrupted, the TCB will point to userspace, and not to kernel space. I then can trace (single step) code both in kernel mode and user-mode, and the TCB register state will reflect the real state of the code being debugged, and not a pointer to a stack location. This also works for V86 mode. For voluntary schedules, the TCB will point to the point where the thread is resumed, and I can trigger a resume in the kernel debugger and follow the restart of the code from where it continues. I can use the application debugger (Watcom) to do the same, including tracing into kernel code.

Octocontrabass · Post by **Octocontrabass** » Mon Oct 16, 2023 1:21 pm

rdos wrote:Exactly, so why build it in other cases?

Who said anything about building an interrupt context when there is no interrupt?

rdos · Post by **rdos** » Mon Oct 16, 2023 1:26 pm

Octocontrabass wrote:
rdos wrote:Exactly, so why build it in other cases?
Who said anything about building an interrupt context when there is no interrupt?

You must save the register state of threads regardless of whether they are preempted. If you save registers on the stack, then you need to do that regardless of why the thread is blocked. And the code that returns to a thread must see a consistent return stack, or load registers from the TCB.

Octocontrabass · Post by **Octocontrabass** » Mon Oct 16, 2023 2:01 pm

All context switching is performed by calling a function that switches stacks. If a thread is preempted, the function is called in the interrupt handler. Since it's an ordinary function call, it only needs to preserve registers that an ordinary function call isn't allowed to modify. In the System V i386 psABI, those registers are EBX, ESI, EDI, EBP, and ESP. The return stack is always consistent because context switches can't happen outside of that function.
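
A sketch of what that implies for the stack of a suspended thread, assuming a switch routine that pushes the four callee-saved general registers and stores ESP in the TCB. The struct names, the field order, and `thread_prepare` are assumptions for illustration; the slots are `uintptr_t` so the model also compiles on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What a hypothetical switch routine leaves on a suspended thread's stack,
 * lowest address first: the callee-saved registers it pushed, then the
 * return address pushed by the CALL. ESP itself lives in the TCB. */
struct switch_frame {
    uintptr_t edi, esi, ebx, ebp;
    uintptr_t ret;
};

struct tcb {
    uintptr_t *saved_sp;   /* where the switch routine resumes this thread */
};

/* Builds the frame a brand-new thread needs so that the first switch to it
 * "returns" into `entry`. */
void thread_prepare(struct tcb *t, uintptr_t *stack_top, void (*entry)(void))
{
    struct switch_frame *f = (struct switch_frame *)stack_top - 1;
    f->edi = f->esi = f->ebx = f->ebp = 0;
    f->ret = (uintptr_t)entry;
    t->saved_sp = (uintptr_t *)f;
}

/* Placeholder thread body for the example. */
static void demo_entry(void)
{
}
```

Every suspended thread, whether preempted or blocked voluntarily, has exactly this frame on top of its stack, which is why the resume path never needs to know how the thread was suspended.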

rdos · Post by **rdos** » Tue Oct 17, 2023 1:46 am

Octocontrabass wrote:All context switching is performed by calling a function that switches stacks. If a thread is preempted, the function is called in the interrupt handler. Since it's an ordinary function call, it only needs to preserve registers that an ordinary function call isn't allowed to modify. In the System V i386 psABI, those registers are EBX, ESI, EDI, EBP, and ESP. The return stack is always consistent because context switches can't happen outside of that function.

That's assuming the context switching takes place in C code, but all of my context switching calls are in assembly, and my general "ABI" rule is that a function must save ALL registers it uses, including segment registers DS, ES, FS and GS.

So, what you are saying is that you return to the function with random registers, and this works because your compiler has some rules that tell you which registers might be modified. So, when you move to a new compiler or change the rules, your code will break in subtle ways. This method makes it impossible to determine the register state of blocked and ready-to-run threads, and a kernel debugger will be unable to show register state because this depends on how a thread was blocked.

My initial context switching design used hardware task switching, and it's implicit in hardware task switching that all registers are saved & loaded on context switches, and it provides a good tool to inspect task state (registers). When I stopped using hardware task switching, I created a save method that saved registers in the TSS at the same points, and a load method that loaded registers from the TSS. Since then I have added all the 64-bit long mode registers to the TCB too, so I can save all kinds of state in the TCB. I prefer to integrate the debugger with the scheduler for the best user experience. That means that context switching will save registers in a consistent manner so a debugger can show them on demand. Besides, in order to support both kernel and userspace debugging, it must be possible to save registers per task, and the most convenient place to save them is the TCB.
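
For readers following along, the TCB-centric approach described here might look roughly like the following. This is a guess at the general shape, not rdos's actual code; the struct layout and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full 32-bit register set, including segment registers. */
struct cpu_regs {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
    uint32_t eip, eflags;
    uint16_t cs, ds, es, fs, gs, ss;
};

enum thread_state { THREAD_RUNNING, THREAD_READY, THREAD_BLOCKED };

struct tcb {
    enum thread_state state;
    struct cpu_regs regs;
};

/* Called at every point where a thread stops running, preempted or not,
 * so the TCB always holds the state the thread will resume with. */
void tcb_save(struct tcb *t, const struct cpu_regs *live, enum thread_state why)
{
    t->regs = *live;
    t->state = why;
}

/* Debugger query: the saved registers are meaningful for any thread that
 * is not currently running, regardless of how it was suspended. */
const struct cpu_regs *tcb_debug_regs(const struct tcb *t)
{
    return t->state == THREAD_RUNNING ? NULL : &t->regs;
}
```

The cost is copying the whole register set on every switch; the benefit is that the debugger has one place to look.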

Octocontrabass · Post by **Octocontrabass** » Tue Oct 17, 2023 11:07 am

rdos wrote:That's assuming the context switching takes place in C code,

It doesn't matter which code calls the context switch function as long as it follows the ABI.

rdos wrote:my general "ABI" rule is that a function must save ALL registers it uses, including segment registers DS, ES, FS and GS.

Nobody uses segmentation except you.

rdos wrote:So, when you move to a new compiler or change the rules, your code will break in subtle ways.

Several compilers support the System V ABI, so it's entirely possible to move to a new compiler without breaking anything. Changing to a different ABI involves rewriting all assembly code, so it doesn't matter how subtly it breaks.

rdos wrote:This method makes it impossible to decide the register state of blocked and ready to run threads, and a kernel debugger will be unable to show register state because this depends on how a thread was blocked.

It might be more difficult, but it's far from impossible. If a thread wasn't blocked by an interrupt, set a breakpoint on the first instruction that will run when it resumes to interrupt it.

rdos · Post by **rdos** » Tue Oct 17, 2023 1:00 pm

Octocontrabass wrote:
rdos wrote:So, when you move to a new compiler or change the rules, your code will break in subtle ways.
Several compilers support the System V ABI, so it's entirely possible to move to a new compiler without breaking anything. Changing to a different ABI involves rewriting all assembly code, so it doesn't matter how subtly it breaks.

Not at all. I define all syscalls by registers, and I'm sure more compilers than OpenWatcom can define external functions that use specific registers for parameters. Actually, I know this is possible with GCC too. Part of these definitions is which registers are modified. No need for any general C language ABI when you can define functions by registers. I do this in other interfaces between C/C++ and assembler too. It's the most convenient method for mixing assembly with C/C++.

Besides, most thread blockings originate in user space, and you cannot assume the ABI in user space is the same as in kernel space, so you end up needing to save all registers in the interface anyway. At least if you don't want to limit which compiler you must use in user space. You also need to validate all parameters so user space cannot modify code or data in kernel space. At least when user space uses a flat selector that spans kernel too.

Octocontrabass · Post by **Octocontrabass** » Tue Oct 17, 2023 2:30 pm

rdos wrote:No need for any general C language ABI when you can define functions by registers.

In other words, you've defined your own ABI. If you change your ABI, you need to rewrite any assembly code that uses your ABI.

rdos wrote:Besides, most thread blockings originate in user space, and you cannot assume the ABI in user space is the same as in kernel space, so you end up needing to save all registers in the interface anyway.

Right. That means you don't need to save all registers during the context switch.

rdos wrote:You also need to validate all parameters so user space cannot modify code or data in kernel space.

What does that have to do with context switching?

rdos · Post by **rdos** » Wed Oct 18, 2023 1:32 pm

Octocontrabass wrote:
rdos wrote:No need for any general C language ABI when you can define functions by registers.
In other words, you've defined your own ABI. If you change your ABI, you need to rewrite any assembly code that uses your ABI.

I cannot change the syscall ABI, since that would break applications. I only add new entries and never modify old ones. So, no, there never will be any need to rewrite that.

Octocontrabass wrote:
rdos wrote:Besides, most thread blockings originate in user space, and you cannot assume the ABI in user space is the same as in kernel space, so you end up needing to save all registers in the interface anyway.
Right. That means you don't need to save all registers during the context switch.

My design works the opposite way. Syscalls are implemented with call gates, and go directly to the destination procedure. Therefore, registers are not normally saved on the stack. The kernel side typically only uses a few registers, but since the user space ABI requires registers to be preserved, it's convenient to save all registers in the TCB when a task is blocked.

Octocontrabass · Post by **Octocontrabass** » Wed Oct 18, 2023 7:54 pm

rdos wrote:I cannot change the syscall ABI,

But that's separate from the context switching function's ABI. (And you're only stuck with your current syscall ABI because you didn't force applications to dynamically link against library code to perform syscalls.)

rdos wrote:My design works the opposite way. Syscalls are implemented with call gates,

Nobody uses call gates except you.

rdos wrote:The kernel side typically only uses a few registers,

Is the kernel side written in assembly or C?

KrotovOSdev · Post by **KrotovOSdev** » Thu Oct 19, 2023 2:16 pm

Now my interrupt handlers entry points look like this

```nasm
global sched_time_handler
extern do_timer_int

sched_time_handler:
    cli
    ;pushad
    push ebp
    call do_timer_int
    ;popad
    sti
    iret

global kernel_pf_handler
extern do_page_fault

kernel_pf_handler:
    cli
    pop ebx
    pushad
    push ebx
    call do_page_fault
    popad
    sti
    iret
```

I think the problem is a stack overflow. When the interrupt handler is called, it calls resched(), which creates a new stack frame. If I'm right (and I may not be), I somehow have to avoid that. How can I do this?
