
Context switch on timer interrupt

Posted: Wed Jul 05, 2017 5:16 am
by steamboat
Hi there,

I have a question about context switches on timer interrupts (x86_64) that confuses me.
In theory, I thought it works as follows:
  • On a timer interrupt, the OS switches the registers (rbx, r12, ..., r15, rbp, rsp) from the current thread to the next thread (save & restore)
  • After that, the interrupt handler executes iret to return from the routine
  • The thread that the OS switched to during interrupt handling will run
My context structure looks like:

Code: Select all

struct ContextStack
{
    void *rbx;
    void *r12;
    void *r13;
    void *r14;
    void *r15;
    void *rbp;
    void *rsp;
    char fpu[108];
};

class Context 
{
public:
    ContextStack *stack;
    // some more attributes that don't matter here
};

and my switch looks like:

Code: Select all

save_and_restore:
	mov [rdi+0x0],  rbx
	mov [rdi+0x8],  r12
	mov [rdi+0x10], r13
	mov [rdi+0x18], r14
	mov [rdi+0x20], r15
	mov [rdi+0x28], rbp
	mov [rdi+0x30], rsp

	mov rbx, [rsi+0x0]
	mov r12, [rsi+0x8]
	mov r13, [rsi+0x10]
	mov r14, [rsi+0x18]
	mov r15, [rsi+0x20]
	mov rbp, [rsi+0x28]
	mov rsp, [rsi+0x30]

	mov rdi, rdx

	ret
The save_and_restore routine is called in C++ by

Code: Select all

void save_and_restore (struct ContextStack *current_stack, ContextStack *next_stack, void *current_context);
where current_context is a pointer to the Context instance, which is currently running.
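The assembly above hard-codes the displacements 0x0 to 0x30, so it only works while the struct layout stays exactly as posted. A minimal sketch of a compile-time check, assuming a 64-bit target; the struct is copied from the post and only the static_assert lines are additions:

```cpp
#include <cstddef>  // offsetof

// Copied from the post above; only the static_asserts below are added.
struct ContextStack
{
    void *rbx;
    void *r12;
    void *r13;
    void *r14;
    void *r15;
    void *rbp;
    void *rsp;
    char fpu[108];
};

// Each offset must equal the displacement used in save_and_restore.
static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit target");
static_assert(offsetof(ContextStack, rbx) == 0x00, "rbx displacement");
static_assert(offsetof(ContextStack, r12) == 0x08, "r12 displacement");
static_assert(offsetof(ContextStack, r13) == 0x10, "r13 displacement");
static_assert(offsetof(ContextStack, r14) == 0x18, "r14 displacement");
static_assert(offsetof(ContextStack, r15) == 0x20, "r15 displacement");
static_assert(offsetof(ContextStack, rbp) == 0x28, "rbp displacement");
static_assert(offsetof(ContextStack, rsp) == 0x30, "rsp displacement");
```

If a field is ever added or reordered, the build fails instead of the switch silently loading the wrong register.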

What confuses me: I log some text after calling the save_and_restore routine within the timer interrupt handler, but that log call is never executed. It seems like the context is switched but the handler never finishes. Also, after the first context switch, I never receive any timer interrupts again. Shouldn't the log be executed and the switch be done after calling iret?
Here is what my handler looks like:

Code: Select all

void on_timer_interrupt()
{
    log("starting timer interrupt handler"); // I see this
    // do some work...
    Context *current_context = // get the current
    Context *next_context = // dequeue the next context to run

    enqueue_to_ready_list(current_context);
    save_and_restore(current_context->stack, next_context->stack, current_context);
    enable_interrupts();
    log("timer interrupt done"); // this is never logged
    // call iret
}
Thanks and greetings :-)

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 5:38 am
by iansjack
Why do you save just some registers, not all of them?

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 6:21 am
by sebihepp
If you use the PIT as the timer, then you need to read Status Register C once for every interrupt the PIT generates. Just think of it like the EOI you have to send to the PIC, but here it is done by reading a port. ^^

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 7:24 am
by bzt
First, you need to save and restore ALL registers for a context switch. This also includes the xmm/SSE registers etc. if you use float or double in your C++ code. Second, you'll need two separate functions, one for saving and one for restoring them. On timer interrupt, you should:
1. save context
2. do other stuff in isr
3. switch to new task (change cr3 or the pointer to the registers save area)
4. restore context
5. iret
If you cannot do 3. (for example when only one task is running), then 1. and 4. will save and restore the state of the same task. If you have more tasks, then 4. will load a state that will be saved by 1. on the NEXT interrupt.
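The five steps above can be sketched from the C++ side roughly as follows. This is only an illustration under stated assumptions: SavedState, Task, tasks, task_count, current and on_timer_tick are hypothetical names, the assembly stub that performs steps 1, 4 and 5 is not shown, and plain round-robin stands in for whatever scheduling policy the kernel really uses.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical save area that the assembly ISR stub fills in step 1.
struct SavedState {
    std::uint64_t gpr[15];                   // every general-purpose register except rsp
    std::uint64_t rip, cs, rflags, rsp, ss;  // values the CPU pushed on interrupt
};

struct Task {
    SavedState state;
    std::uint64_t cr3;  // address space, if each task has its own
};

constexpr std::size_t kMaxTasks = 4;
Task tasks[kMaxTasks];
std::size_t task_count = 0;  // number of valid entries in tasks
std::size_t current = 0;     // index of the task that was interrupted

// Steps 2 and 3: called by the stub after it saved into tasks[current].state.
// Returns the save area the stub should restore from in step 4.
SavedState *on_timer_tick() {
    // Step 2 ("other stuff") would go here: acknowledge the timer, update ticks.
    if (task_count > 1)
        current = (current + 1) % task_count;  // step 3: pick the next task
    return &tasks[current].state;              // same task again if it is alone
}
```

With a single task the function returns the save area that was just filled, which matches the remark that steps 1 and 4 then operate on the same task.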

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 8:16 am
by iansjack
To add to your 3, it is common to have a dummy task that is always runnable in case all other tasks are blocked. This should, obviously, run at a lower priority than all other tasks and can be used to do background maintenance.
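A minimal sketch of that fallback, assuming a flat array of tasks and the convention that a larger priority value means more urgent. SchedEntry and pick_next are hypothetical names, not taken from any kernel in this thread.

```cpp
#include <cstddef>

struct SchedEntry {
    bool runnable;
    int priority;  // assumption: larger value = more urgent
};

// Returns the index of the most urgent runnable task, or idle_index when
// every other task is blocked. The idle task itself is never compared.
std::size_t pick_next(const SchedEntry *entries, std::size_t count,
                      std::size_t idle_index) {
    std::size_t best = idle_index;
    bool found = false;
    for (std::size_t i = 0; i < count; ++i) {
        if (i == idle_index || !entries[i].runnable)
            continue;
        if (!found || entries[i].priority > entries[best].priority) {
            best = i;
            found = true;
        }
    }
    return best;
}
```

Because the idle task is only chosen when nothing else qualifies, it effectively runs below every other priority without needing a special priority value.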

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 10:28 am
by BrightLight
sebihepp wrote:If you use the PIT as the timer, then you need to read Status Register C once for every interrupt the PIT generates. Just think of it like the EOI you have to send to the PIC, but here it is done by reading a port. ^^
Just to correct this statement in case it's relevant in the future: it is valid for the CMOS RTC, not the PIT. The PIT has no "status register C"; you only need to read it when you use the CMOS RTC.
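For the RTC case, the acknowledgement could look roughly like this. The port numbers (0x70 index, 0x71 data) and the register index 0x0C are the standard CMOS ones; everything else is an assumption for illustration: the PortIo template parameter is a hypothetical wrapper around outb/inb, and FakeCmos only exists so the sketch can be tried on a host machine.

```cpp
#include <cstdint>

constexpr std::uint16_t kCmosIndexPort = 0x70;
constexpr std::uint16_t kCmosDataPort  = 0x71;
constexpr std::uint8_t  kRtcStatusC    = 0x0C;

// PortIo is any type with outb(port, value) and inb(port); in a kernel these
// would wrap the out/in instructions (hypothetical interface).
template <typename PortIo>
std::uint8_t rtc_acknowledge(PortIo &io) {
    io.outb(kCmosIndexPort, kRtcStatusC);  // select status register C
    return io.inb(kCmosDataPort);          // reading it re-arms the RTC interrupt
}

// Host-side stand-in for the hardware, only for experimenting.
struct FakeCmos {
    std::uint8_t selected = 0;
    std::uint8_t reg_c = 0;
    void outb(std::uint16_t port, std::uint8_t value) {
        if (port == kCmosIndexPort) selected = value;
    }
    std::uint8_t inb(std::uint16_t port) {
        if (port != kCmosDataPort || selected != kRtcStatusC) return 0;
        std::uint8_t flags = reg_c;
        reg_c = 0;  // reading register C clears the pending flags
        return flags;
    }
};
```

None of this applies to a PIT-driven timer, where only the EOI to the interrupt controller is needed.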

Re: Context switch on timer interrupt

Posted: Wed Jul 05, 2017 1:51 pm
by bzt
iansjack wrote:To add to your 3, it is common to have a dummy task that is always runnable in case all other tasks are blocked. This should, obviously, run at a lower priority than all other tasks and can be used to do background maintenance.
Absolutely correct! That dummy task is often called "idle".

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 9:59 am
by Boris
bzt wrote:First, you need to save and restore ALL registers for a context switch.
While this is very accurate, you don't need to save them all in a structure, because your structure is probably used in C functions, and calling a function means following a calling convention and letting your compiler save some registers for you (usually on the stack).
See the preserved registers of your ABI.
Even if your doSwitch(currentRegisters, targetRegisters) is in assembly, any C compiler will wrap the call with some pushes and pops.

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 10:21 am
by iansjack
Why would it do that? As far as GCC knows that assembler function doesn't use any registers.

It seems to me to be messy to save some registers in a structure and some on the stack.

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 10:32 am
by bzt
Boris wrote:
bzt wrote:First, you need to save and restore ALL registers for a context switch.
While this is very accurate, you don't need to save them all in a structure, because your structure is probably used in C functions, and calling a function means following a calling convention and letting your compiler save some registers for you (usually on the stack).
See the preserved registers of your ABI.
Even if your doSwitch(currentRegisters, targetRegisters) is in assembly, any C compiler will wrap the call with some pushes and pops.
Yes, you do need to save them all. Maybe the ABI saves some of the registers, but there's no guarantee that the timer interrupt happens when you do a call. It can happen in the middle of your function as well, when registers are being used and not saved. Consider this:

Code: Select all

mylittlefunction:
push rbp
mov rbp, rsp
mov rax, qword [rbp+16]
; ---- interrupt happens here ----
mov rdx, qword [rbp+8]
mov rcx, qword [rbp+24]
mul rcx
If you don't save, for example, rax, the function would not calculate the correct product and would give different results for the same input when called several times, depending on whether it was interrupted or not. It's a bug I wouldn't wish even on my worst enemies to debug.

I'd like to point out that saving a task's state has nothing to do with the ABI or function calls whatsoever. They are totally different things. You can't make any assumptions about what a task is doing when an interrupt fires (hence the name, interrupt). It's the ISR's responsibility to save and restore the state so that the task doesn't know at all that it was interrupted.

EDIT: now I'm thinking you may be referring to saving registers on the interrupt handler's stack. That's not good either, because you have to save the registers BEFORE you can make any C call, meaning save/restore must be written in assembly without the C ABI; not to mention that there's no guarantee that the task's stack is the same as the ISR's stack. For example my kernel uses a safe stack for interrupts (three to be more precise, one for debug interrupt, one for irq interrupts and one for nmi). I never save the interrupt's return address on the task's local stack.
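To make "all registers" concrete, here is one possible layout of a complete save area, as a sketch only. The order of the general-purpose registers is an arbitrary choice that a (not shown) assembly stub would have to match; the five values at the end are what the CPU itself pushes on an interrupt in long mode, and 512 bytes with 16-byte alignment is what the fxsave instruction requires. InterruptFrame, FxsaveArea and TaskState are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Memory image written by fxsave: 512 bytes, must be 16-byte aligned.
struct alignas(16) FxsaveArea {
    std::uint8_t bytes[512];
};

struct InterruptFrame {
    // Pushed by the assembly stub; the order is this sketch's assumption.
    std::uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    std::uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    // Pushed by the CPU on interrupt entry in long mode (no error code for IRQs).
    std::uint64_t rip, cs, rflags, rsp, ss;
};

struct TaskState {
    FxsaveArea fpu;       // x87/SSE state
    InterruptFrame regs;  // integer state plus return frame
};

static_assert(sizeof(InterruptFrame) == 20 * 8, "15 GPRs + 5 CPU-pushed values");
static_assert(sizeof(FxsaveArea) == 512, "fxsave image size");
static_assert(alignof(TaskState) == 16, "fxsave needs 16-byte alignment");
```

Note that the 108-byte fpu field in the original ContextStack matches the older fsave image, which does not include the xmm registers.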

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 3:00 pm
by steamboat
For example my kernel uses a safe stack for interrupts (three to be more precise, one for debug interrupt, one for irq interrupts and one for nmi)
Just for interest: How do you switch the stack before an interrupt occurs?

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 3:03 pm
by iansjack
In long mode you can specify a number of stacks in the TSS. Then, in the IDT, you specify which stack a particular interrupt will use. See Vol. 3 of the Intel manual for full details.
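As a rough sketch of the two structures involved: the field layout follows the 64-bit TSS and IDT gate formats from the Intel manual, while the struct names and the use_ist helper are hypothetical. The packed attribute is GCC/Clang syntax.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit TSS: holds up to seven interrupt stack table (IST) pointers.
struct __attribute__((packed)) Tss64 {
    std::uint32_t reserved0;
    std::uint64_t rsp[3];    // stacks for privilege levels 0..2
    std::uint64_t reserved1;
    std::uint64_t ist[7];    // IST1..IST7
    std::uint64_t reserved2;
    std::uint16_t reserved3;
    std::uint16_t iomap_base;
};

// 64-bit IDT gate: the low 3 bits of byte 4 select the IST entry.
struct __attribute__((packed)) IdtGate64 {
    std::uint16_t offset_low;
    std::uint16_t selector;
    std::uint8_t  ist;        // 0 = no IST switch, 1..7 = use IST1..IST7
    std::uint8_t  type_attr;
    std::uint16_t offset_mid;
    std::uint32_t offset_high;
    std::uint32_t reserved;
};

static_assert(sizeof(Tss64) == 104, "64-bit TSS is 104 bytes");
static_assert(sizeof(IdtGate64) == 16, "64-bit IDT gate is 16 bytes");

// Hypothetical helper: give one gate its own stack. index must be 1..7.
inline void use_ist(Tss64 &tss, IdtGate64 &gate, unsigned index,
                    std::uint64_t stack_top) {
    tss.ist[index - 1] = stack_top;
    gate.ist = static_cast<std::uint8_t>(index & 0x7);
}
```

When the gate's IST field is non-zero, the CPU loads rsp from the matching TSS entry before pushing the interrupt frame, regardless of which stack was active.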

Re: Context switch on timer interrupt

Posted: Thu Jul 06, 2017 4:20 pm
by Sik
iansjack wrote:Why would it do that? As far as GCC knows that assembler function doesn't use any registers.
Not to mention that different languages are likely using different ABIs, and you don't want to hardcode an OS to a single language.

Re: Context switch on timer interrupt

Posted: Fri Jul 07, 2017 7:47 am
by bzt
steamboat wrote:Just for interest: How do you switch the stack before an interrupt occurs?
Yes, iansjack was right, I use ISTs in TSS.
https://www.kernel.org/doc/Documentatio ... nel-stacks
x86_64 also has a feature which is not available on i386, the ability
to automatically switch to a new stack for designated events such as
double fault or NMI
They are wrong about nesting though:
Switching to the kernel interrupt stack is done by software based on a
per CPU interrupt nest counter. This is needed because x86-64 "IST"
hardware stacks cannot nest without races.
I simply use a sub/add pair to rearrange the new stack for the next interrupt and voilà, the race conditions are gone.
I suppose it's a Hungarian thing, we have an annoying habit of doing things that the rest of the world thinks impossible... :lol:

Re: Context switch on timer interrupt

Posted: Fri Jul 07, 2017 8:10 am
by LtG
bzt wrote:
steamboat wrote:Just for interest: How do you switch the stack before an interrupt occurs?
Yes, iansjack was right, I use ISTs in TSS.
https://www.kernel.org/doc/Documentatio ... nel-stacks
x86_64 also has a feature which is not available on i386, the ability
to automatically switch to a new stack for designated events such as
double fault or NMI
They are wrong about nesting though:
Switching to the kernel interrupt stack is done by software based on a
per CPU interrupt nest counter. This is needed because x86-64 "IST"
hardware stacks cannot nest without races.
I simply use a sub/add pair to rearrange the new stack for the next interrupt and voilà, the race conditions are gone.
I suppose it's a Hungarian thing, we have an annoying habit of doing things that the rest of the world thinks impossible... :lol:
With regard to IST, do you only support long mode?

Of course, the 386 also supports this, through hardware task switching, though it comes with performance penalties.

And they aren't wrong about nesting: they said there are race conditions, and there are. That doesn't mean the race conditions can't be dealt with (with the possible exception of NMI-SMI-NMI, which has its own thread here on the forum).

bzt, what happens to your code if another IRQ occurs before your "sub/add pair" and the same IST entry is used, so that the new IRQ overwrites the stack before you have saved it? Also, why do you use "cli" as the first instruction in your ISR?
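The window in question can be illustrated with a toy model. This is purely a host-side simulation under simplifying assumptions, not bzt's code: IstModel and first_frame_survives are hypothetical names, the stack is an array of qword slots, and the "sub" is modelled as lowering the IST entry by a fixed amount.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy model of one IST stack. All names here are hypothetical.
struct IstModel {
    static constexpr std::size_t kSlots = 64;      // stack size in qwords
    static constexpr std::size_t kFrameSlots = 5;  // SS, RSP, RFLAGS, CS, RIP
    std::array<std::uint64_t, kSlots> stack{};
    std::size_t ist_top = kSlots;                  // models the IST entry in the TSS

    // What the CPU does on entry: load rsp from the IST entry, push the frame.
    // Returns the slot that holds the saved RIP.
    std::size_t enter(std::uint64_t interrupted_rip) {
        std::size_t rsp = ist_top - kFrameSlots;
        stack[rsp] = interrupted_rip;
        return rsp;
    }
};

// Returns true if the first frame's saved RIP is still intact after a second
// entry through the same IST slot.
bool first_frame_survives(bool adjust_ist_before_second_entry) {
    IstModel cpu;
    const std::size_t first = cpu.enter(0x1000);
    if (adjust_ist_before_second_entry)
        cpu.ist_top -= 16;  // the "sub": move the IST entry below the live frame
    cpu.enter(0x2000);      // second interrupt on the same IST index
    return cpu.stack[first] == 0x1000;
}
```

In this model first_frame_survives(false) is false and first_frame_survives(true) is true: the adjustment only protects the first frame if nothing can enter through the same IST entry before it has run.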