OSDev.org

Posted: **Sat Jul 23, 2016 10:00 pm**

I am currently coding a simple scheduler for my OS.
In my understanding, the typical way to implement scheduling is like this:

Timer interrupt calls scheduler, scheduler does context switch, which saves registers from last process A and loads register values of the next process B.

My question is, typically, how does the context switch function get the register values of the last process A since all the current register values are not the same as those before process A gets interrupted by timer interrupt.

Thanks in advance

Posted: **Sat Jul 23, 2016 11:43 pm**

Hi,

szhou42 wrote: In my understanding, the typical way to implement scheduling is like this:

Timer interrupt calls scheduler, scheduler does context switch, which saves registers from last process A and loads register values of the next process B.

Typically most tasks block (because they need to wait for a time delay or data from disk or networking or keyboard or IPC or something) and unblock (because what they were waiting for happened); and this blocking/unblocking is responsible for most task switching. The timer is only there for the cases where something doesn't block first and can be left as an afterthought (or not done at all if you want a pure "highest priority thread that can run does run" system).

szhou42 wrote: My question is, typically, how does the context switch function get the register values of the last process A since all the current register values are not the same as those before process A gets interrupted by timer interrupt.

Normally you'd have a "switch_to(thread)" function that can only be called by other kernel code, that does the actual task switch. For this function the registers and stack always look the same (regardless of what called it).

Note that you should probably have:

- A thread state for each thread (saying if it's "running", "ready to run", or blocked for some reason).
- The "switch_to(thread)" function that does nothing if the thread being switched to is currently running; or (otherwise):
  - saves the currently running thread's register contents, etc.
  - checks if the currently running thread's state is "running" and if it is, puts the currently running thread back into the "ready to run" state, and puts it on whatever data structure the scheduler uses to keep track of "ready to run" threads
  - sets the state of the thread being switched to to "running", and loads its register contents, etc.
- A "find_thread_to_switch_to()" function that chooses a thread to switch to (using whatever data structure the scheduler uses to keep track of "ready to run" threads), removes the selected thread from the scheduler's data structure, and then calls the "switch_to(thread)" function.
- A "block_thread(reason)" function which sets the currently running thread's state to whatever the reason is, then calls the "find_thread_to_switch_to()" function.
- An "unblock_thread(thread)" function which sets a thread's state to "ready to run", then decides if the thread being unblocked should preempt the currently running thread. If the thread being unblocked shouldn't preempt then it puts the thread on whatever data structure the scheduler uses to keep track of "ready to run" threads; and if the thread being unblocked should preempt it calls the "switch_to(thread)" function instead (to cause an immediate task switch, without bothering with the unnecessary overhead of the "find_thread_to_switch_to()" function).
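The pieces listed above could be sketched in C roughly as follows. This is only a sketch of the bookkeeping, not a definitive implementation: the type and helper names (`struct thread`, `ready_push`, `ready_pop`), the FIFO ready queue and the `should_preempt` flag are assumptions made for illustration, and the actual register save/load (which belongs in assembly) is reduced to comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; register save/load is stubbed out. */
enum thread_state { RUNNING, READY, BLOCKED_SLEEP, BLOCKED_IO };

struct thread {
    enum thread_state state;
    struct thread *next;                 /* link in the ready queue */
};

static struct thread *current;           /* thread that owns the CPU */
static struct thread *ready_head, *ready_tail;

/* Append a thread to the tail of the FIFO ready queue. */
static void ready_push(struct thread *t) {
    t->next = NULL;
    if (ready_tail) ready_tail->next = t; else ready_head = t;
    ready_tail = t;
}

/* Remove and return the head of the ready queue (NULL if empty). */
static struct thread *ready_pop(void) {
    struct thread *t = ready_head;
    if (t) {
        ready_head = t->next;
        if (!ready_head) ready_tail = NULL;
        t->next = NULL;
    }
    return t;
}

void switch_to(struct thread *t) {
    if (t == current) return;            /* already running: nothing to do */
    /* A real kernel saves the current thread's registers here. */
    if (current && current->state == RUNNING) {
        current->state = READY;          /* it was preempted, not blocked */
        ready_push(current);
    }
    t->state = RUNNING;
    current = t;
    /* A real kernel loads t's registers here. */
}

void find_thread_to_switch_to(void) {
    struct thread *t = ready_pop();
    if (t) switch_to(t);
    /* else nothing is ready; a real kernel would idle here */
}

void block_thread(enum thread_state reason) {
    assert(reason != RUNNING && reason != READY);
    current->state = reason;             /* so switch_to() won't requeue it */
    find_thread_to_switch_to();
}

/* should_preempt stands in for whatever priority policy the kernel uses. */
void unblock_thread(struct thread *t, int should_preempt) {
    t->state = READY;
    if (should_preempt) switch_to(t);    /* immediate switch, skips the queue */
    else ready_push(t);
}
```

Because `block_thread()` changes the state before switching, `switch_to()` can tell a blocked thread from a preempted one just by checking whether the state is still "running".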

I would recommend implementing these in the order listed; and testing that each piece works before worrying about the next piece. For example; with "switch_to(thread)" and nothing else you could test it by having a pair of kernel threads that "switch_to()" each other.

After all of that is implemented, tested and working; then worry about the timer IRQ handler. The timer IRQ handler would check if the currently running task has used all the time it was given, and if it has it would check there are other tasks the CPU could be doing, and if there are other tasks that could be running it'd call the "find_thread_to_switch_to()" function.
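A minimal sketch of that timer IRQ logic in C might look like the following. The counters (`ticks_left`, `ready_count`), the `TIME_SLICE_TICKS` value and the stubbed `find_thread_to_switch_to()` are all assumptions for illustration; a real kernel would use its scheduler's own data and would also acknowledge the interrupt controller.

```c
#include <assert.h>

enum { TIME_SLICE_TICKS = 5 };       /* hypothetical slice length, in timer ticks */

/* Hypothetical bookkeeping; a real kernel would keep this per CPU. */
static unsigned ticks_left;          /* remaining time slice of the running thread */
static unsigned ready_count;         /* number of threads that are "ready to run" */
static unsigned switches;            /* counts calls to the stub below */

/* Stand-in for the scheduler's real function: just records the call
 * and hands the next thread a fresh time slice. */
static void find_thread_to_switch_to(void) {
    switches++;
    ticks_left = TIME_SLICE_TICKS;
}

void timer_irq_handler(void) {
    if (ticks_left > 0) ticks_left--;
    if (ticks_left > 0) return;      /* time slice not used up yet */
    if (ready_count == 0) return;    /* nothing else to run: keep going */
    find_thread_to_switch_to();
}
```

Note that the handler does no register juggling of its own; it only decides whether a switch is due and leaves the rest to the scheduler functions.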

Cheers,

Brendan

Posted: **Sun Jul 24, 2016 2:59 am**

szhou42 wrote: My question is, typically, how does the context switch function get the register values of the last process A since all the current register values are not the same as those before process A gets interrupted by timer interrupt.

All interrupt handlers are responsible for preserving the CPU registers, or saving them and restoring them, before executing IRET. The timer interrupt is no different.

So, to answer your question, if your interrupt handler needs to change any registers, it must first save their values to the stack, and restore them before executing IRET, regardless of whether it calls your context switch function or not.

As for your context switch function, itself, it must store the values of all CPU registers before it changes anything. The only register that will definitely change is the instruction pointer (EIP), but that register is already saved on the stack when your interrupt handler is called (and again when your context switch function is called).

As a matter of fact, if you push all registers to the stack in your context switch function, the only register you need to store for that particular process is the stack pointer (ESP). Pretty cool.
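To make the "only the stack pointer needs storing" idea concrete, here is a small simulation in C. It is not real context-switch code (that has to be assembly); the `struct cpu`, the four stand-in registers and the stack size are assumptions chosen only so the idea can be run and checked on any machine.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 4            /* stand-ins for a handful of CPU registers */
#define STACK_WORDS 64

/* Simulated CPU: a few registers plus a stack pointer. */
struct cpu {
    uint32_t regs[NREGS];
    uint32_t *sp;
};

struct process {
    uint32_t stack[STACK_WORDS];
    uint32_t *saved_sp;    /* the only per-process register slot */
};

void context_switch(struct cpu *cpu, struct process *from, struct process *to) {
    /* Push every register onto the outgoing process's stack. */
    for (int i = 0; i < NREGS; i++) *--cpu->sp = cpu->regs[i];
    from->saved_sp = cpu->sp;          /* remember where they are */

    /* Switch stacks, then pop the incoming process's registers
     * in the reverse of the order they were pushed. */
    cpu->sp = to->saved_sp;
    for (int i = NREGS - 1; i >= 0; i--) cpu->regs[i] = *cpu->sp++;
}
```

A process that has never run has nothing on its stack yet, so its stack has to be pre-filled with initial register values before the first switch to it.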

Posted: **Sun Jul 24, 2016 10:44 am**

Brendan wrote: Normally you'd have a "switch_to(thread)" function that can only be called by other kernel code, that does the actual task switch. For this function the registers and stack always look the same (regardless of what called it).

But how do I get the registers from last process (before it blocks/interrupted by timer) ?
Do I get it from register info pushed to the stack when interrupts happened, like SpyderTL suggested?

Posted: **Sun Jul 24, 2016 6:30 pm**

szhou42 wrote: But how do I get the registers from last process (before it blocks/interrupted by timer) ?
Do I get it from register info pushed to the stack when interrupts happened, like SpyderTL suggested?

There is no "register info" pushed to the stack when an interrupt happens.

You have to push any registers that you need to modify to the stack in your interrupt handler, yourself. This goes for all interrupt handlers, not just the timer interrupt handler.

The only information pushed to the stack automatically by the CPU when an interrupt happens is the EFLAGS register, the CS register, and the EIP register (plus SS and ESP if the interrupt causes a privilege level change, and an error code for some exceptions). The rest, you have to store on the stack yourself, if you want to change them.
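For reference, the frame the CPU builds on interrupt entry in 32-bit protected mode can be described with a C struct like the one below. The struct and function names are made up for this sketch, and it covers the case without an error code; the layout itself (EIP at the lowest address, then CS, then EFLAGS, then ESP and SS only on a privilege change) follows the architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What a 32-bit x86 CPU pushes on interrupt entry, lowest address first.
 * (Some exceptions also push an error code below EIP; not shown here.) */
struct interrupt_frame {
    uint32_t eip;
    uint32_t cs;        /* selector in the low 16 bits */
    uint32_t eflags;
    /* Only present when the interrupt crossed privilege levels
     * (for example user mode -> kernel mode): */
    uint32_t user_esp;
    uint32_t user_ss;
};

/* The low two bits of the saved CS are the privilege level of the
 * interrupted code; 3 means it was running in user mode. */
int frame_from_user_mode(const struct interrupt_frame *f) {
    return (f->cs & 3) == 3;
}
```

Everything else (general purpose registers, data segment registers) is not in this frame and is the handler's job to save.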

If you want to swap to a new process, you are going to need to push them all to the stack, and then store the ESP register value somewhere safe, so that you can restore those registers the next time you want to swap back to the current process.

(Another option would be to store all of the CPU registers in a process structure, instead of pushing them to the stack. It's up to you which way to go.)

To answer your question as simply as I can, if your timer interrupt handler is going to call your context swap function, and if it has modified any registers, it first needs to make sure that those registers have been restored to the values they were when the interrupt handler was started.

Posted: **Mon Jul 25, 2016 9:38 pm**

Hi,

szhou42 wrote:
Normally you'd have a "switch_to(thread)" function that can only be called by other kernel code, that does the actual task switch. For this function the registers and stack always look the same (regardless of what called it).
But how do I get the registers from last process (before it blocks/interrupted by timer) ?
Do I get it from register info pushed to the stack when interrupts happened, like SpyderTL suggested?

In the time between the interrupt occurring and the "switch_to(thread)" function being called, the thread's state has changed - the thread switched to CPL=0, the segment registers probably got replaced by "kernel segments", everything that was in general purpose registers has very likely been replaced by other values, etc. You don't want to store "state of the thread in the past", you want to store "state of the thread now".

Note that "state of the thread now" is always the thread's state in kernel-space. All the user-space state would already be on the stack somewhere, and will be restored when you return to user-space the same as it would be if you didn't do a task switch.

Also:

- because "state of the thread now" is always the thread's state in kernel-space and the kernel's segment registers are typically always the same for all threads; you don't need to save any segment registers and don't need to load most segment registers (you may need to load a maximum of one segment register if it's used for TLS).
- if the "switch_to(thread)" function is called from C, then the ABI for C treats some registers as scratch registers that the caller never expects to survive a call, so you don't need to save/restore them either. For example, for 32-bit 80x86 the registers EAX, ECX and EDX are caller-saved and don't need to be saved/restored; only the callee-saved registers (EBX, ESI, EDI and EBP) do.
- CR3 shouldn't/couldn't have changed so you don't need to save it, but (if the thread being switched to belongs to a different process) you will need to load CR3.
- the CPU is designed specifically to allow you to postpone saving and/or loading the FPU/MMX/SSE/AVX state until it's actually used (and avoid saving it if it wasn't used, and avoid loading it if it wasn't changed since the same thread used it last). If you're using this feature; you wouldn't save or restore FPU/MMX/SSE/AVX state in the "switch_to(thread)" function either.
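The CR3 point can be illustrated with a short C sketch. The `cr3` field, the `loaded_cr3` variable standing in for the real register and the `load_cr3()` stub are assumptions for illustration; in a kernel the load would be a `mov cr3, ...` instruction.

```c
#include <assert.h>
#include <stdint.h>

struct thread {
    uint32_t cr3;                 /* page directory address (hypothetical field) */
};

static uint32_t loaded_cr3;       /* stand-in for the real CR3 register */
static unsigned cr3_loads;        /* counts how often CR3 was written */

/* Stub for the privileged instruction that writes CR3. */
static void load_cr3(uint32_t value) {
    loaded_cr3 = value;
    cr3_loads++;
}

/* CR3 is never saved; it is only loaded when the next thread lives in a
 * different address space than the one currently active. */
void switch_address_space(const struct thread *next) {
    if (next->cr3 != loaded_cr3) load_cr3(next->cr3);
}
```

Skipping the load for threads of the same process matters because writing CR3 flushes most of the TLB.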

Taking all that into account; a minimal "switch_to(thread)" function for 32-bit 80x86 C calling conventions can be less than 15 instructions.

Finally; when you create a new thread the code that creates the thread has to prepare the new thread's kernel stack to match what the "switch_to(thread)" function expects to remove from the new thread's kernel stack. For this reason the "switch_to(thread)" function should be written in pure assembly - otherwise (e.g. if it's done as "C with inline assembly") the C compiler is free to do whatever it likes with the stack and it's impossible for the code that creates a new thread to match the "unpredictable, whatever the C compiler felt like" stack layout.
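As a sketch of what "preparing the new thread's kernel stack" could look like, assume a hypothetical assembly "switch_to(thread)" that ends with `pop ebp`, `pop edi`, `pop esi`, `pop ebx`, `ret`. The frame layout, the enum names and `prepare_kernel_stack()` below are all assumptions tied to that imagined function; the real layout must mirror whatever your own assembly pops. Stack words are `uintptr_t` so the sketch compiles on any host; on 32-bit 80x86 each word is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: the assembly switch_to ends with
 *     pop ebp / pop edi / pop esi / pop ebx / ret
 * so from the saved stack pointer upwards a new thread's stack must
 * hold: EBP, EDI, ESI, EBX, then the address RET will jump to. */
enum { FRAME_EBP, FRAME_EDI, FRAME_ESI, FRAME_EBX, FRAME_RET, FRAME_WORDS };

/* Hypothetical first function of a new kernel thread. */
void example_thread_entry(void) { }

/* Builds the initial frame just below stack_top and returns the value
 * to store as the new thread's saved stack pointer. */
uintptr_t *prepare_kernel_stack(uintptr_t *stack_top, void (*entry)(void)) {
    uintptr_t *sp = stack_top - FRAME_WORDS;
    sp[FRAME_EBP] = 0;                   /* registers start out as zero */
    sp[FRAME_EDI] = 0;
    sp[FRAME_ESI] = 0;
    sp[FRAME_EBX] = 0;
    sp[FRAME_RET] = (uintptr_t)entry;    /* RET "returns" into the thread */
    return sp;
}
```

The first switch to the new thread then behaves exactly like a switch to a thread that had been running before: the pops consume the prepared values and the RET lands in the entry function.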

Cheers,

Brendan

Posted: **Tue Jul 26, 2016 12:03 am**

If you are not changing any segments, what is the benefit of using IRET over using RET in a switch_to method ? just to restore EFLAGS?

Posted: **Tue Jul 26, 2016 4:28 am**

Hi,

Boris wrote: If you are not changing any segments, what is the benefit of using IRET over using RET in a switch_to method ? just to restore EFLAGS?

I don't use IRET for the "switch_to(thread)" function (and I'm not sure anyone else does either).

Cheers,

Brendan

Context switch question