Scheduler masking interrupts
Posted: Mon May 06, 2013 1:38 am
by sukhusukhu
I am trying to drive the scheduler from IRQ0 (the PIT). How does the Linux kernel handle this situation?
What happens when the interrupt handler for IRQ0 causes a context switch from task A to task B and then returns? In particular, what if task A had already been interrupted by IRQ6 when IRQ0 arrives and triggers a context switch?
Should the scheduler let the IRQ6 handler finish first? If not, what should happen to it after the context switch: is the interrupted handler eventually allowed to finish, or is it simply abandoned?
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 2:22 am
by rdos
The scheduler should not be allowed to make a task-switch from inside active IRQs (excluding the preemption IRQ). Even for the preemption IRQ, it must first clear the interrupt in the interrupt controller before switching tasks.
There are different ways of achieving this. Linux and Windows supposedly use top/bottom halves in their rather complex interrupt handlers. My OS uses a simpler method: a per-core counter that tracks the interrupt nesting depth, and as soon as this counter reaches zero while a task switch is pending, the switch is executed.
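Something like this (single core; irq_nesting, switch_pending and pick_and_switch_task are just example names, and both routines are assumed to run with interrupts disabled):
Code: Select all
common_irq_prologue:
	inc dword [irq_nesting]             ;one more IRQ level is active
	ret

common_irq_epilogue:
	dec dword [irq_nesting]             ;leaving one IRQ level
	jnz .no_switch                      ;still nested, so no task switch yet
	cmp byte [switch_pending],0         ;innermost level: was a switch requested?
	je .no_switch
	mov byte [switch_pending],0
	call pick_and_switch_task           ;safe now, no IRQ handler is active
.no_switch:
	ret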
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 3:25 am
by sukhusukhu
rdos wrote:The scheduler should not be allowed to make a task-switch from inside active IRQs (excluding the preemption IRQ). Even for the preemption IRQ, it must first clear the interrupt in the interrupt controller before switching tasks.
There are different ways of achieving this. Linux and Windows supposedly use top/bottom halves in their rather complex interrupt handlers. My OS uses a simpler method: a per-core counter that tracks the interrupt nesting depth, and as soon as this counter reaches zero while a task switch is pending, the switch is executed.
But IRQ0 is the highest-priority interrupt. Do you just acknowledge the PIT interrupts and hold off the task switch until your nesting counter for the other interrupts reaches zero?
Also, what should happen if I am inside the keyboard interrupt handler when IRQ0 delivers a tick to the scheduler and asks for a context switch? (I can't mask interrupts while I am inside the handler for the keyboard or any other device, because I would lose the ability to receive further interrupts.)
Thanks
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 4:10 am
by gravaera
Yo:
Don't think of the system timer IRQs as "causing context switches". Prefer to think of them as "invoking the pre-emptive scheduler", which may or may not decide that a thread switch is in order.
Next, if your scheduler is designed well, there is no reason why IRQ nesting should cause a problem for the pre-emptive scheduler trying to switch the active thread on its CPU. RDOS already addressed this: invoke the pre-emptive scheduler only after all nested IRQs have exited, and only if the preempt_count for the CPU in question is 0.
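As a rough sketch of that distinction (irq0_timer, current_quantum and need_resched are invented names, and the nesting bookkeeping discussed above is omitted), the timer handler itself only does accounting and raises a flag:
Code: Select all
irq0_timer:
	push eax
	mov al,0x20
	out 0x20,al                         ;EOI to the master PIC first
	dec dword [current_quantum]         ;burn one tick of the running thread
	jnz .quantum_left
	mov byte [need_resched],1           ;quantum expired: *request* a switch
.quantum_left:
	pop eax
	iret                                ;the switch happens later, at a safe point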
--Peace out,
gravaera
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 4:14 am
by Combuster
Two things: interrupt handlers shouldn't take so much time that interrupts actually get lost - if you get two IRQs on the same line while you're handling another one, you'll only see one of them (and probably break things). Disabling interrupts during the handling of an IRQ is not necessarily an issue, because of the severely limited time handlers are supposed to take anyway, and it can prove to be a good tool to prevent problems caused by nesting.
You have two basic alternatives: serialize IRQ handling by disabling interrupts, so the whole nesting issue becomes moot, or make it possible to postpone events to "safe" points, like upon return to userspace - and futureproofing tends to require the latter.
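A sketch of that second alternative (assuming a 32-bit kernel where user code runs in ring 3, that the iret frame is on top of the stack at this point, and with need_resched and schedule as placeholder names):
Code: Select all
common_irq_exit:
	test byte [esp+4],3                 ;saved CS: returning to ring 3?
	jz .to_kernel                       ; no - interrupted kernel code, just return
	cmp byte [need_resched],0           ; yes - is a switch pending?
	je .to_kernel
	mov byte [need_resched],0
	call schedule                       ;switch threads before dropping to user mode
.to_kernel:
	iret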
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 4:49 am
by rdos
I prefer to have the safe points in the exit code of the IRQ handling stubs (combining them with the decrement of the nesting counter), as having them only in the return path to user-space means that the kernel basically is not preemptable, which I regard as a bad idea.
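Roughly (reusing the example names from my earlier sketch) - note that there is no check of the saved CS here, so a pending switch is taken even when the interrupted code was kernel code, and that is exactly what keeps the kernel preemptable:
Code: Select all
irq_stub_exit:
	cli                                 ;no races while we decrement and test
	dec dword [irq_nesting]
	jnz .out                            ;still inside a nested IRQ: keep postponing
	cmp byte [switch_pending],0
	je .out
	mov byte [switch_pending],0
	call pick_and_switch_task           ;taken regardless of the interrupted ring
.out:
	iret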
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 5:01 am
by Brendan
Hi,
sukhusukhu wrote:rdos wrote:My OS uses a simpler method: a per-core counter that tracks the interrupt nesting depth, and as soon as this counter reaches zero while a task switch is pending, the switch is executed.
This is what I normally do too.
sukhusukhu wrote:But IRQ0 is the highest-priority interrupt. Do you just acknowledge the PIT interrupts and hold off the task switch until your nesting counter for the other interrupts reaches zero?
Imagine if the IRQ handler for IRQ6 does (something like) this:
Code: Select all
IRQ6_handler:
	inc dword [task_switch_disable]    ;Increase the number of task switch disables
	sti
	;Normal IRQ handling stuff here
	cli
	;Send EOI to PICs here
	dec dword [task_switch_disable]    ;Decrease the number of task switch disables
	jnz .done                          ;Done if task switches are still disabled
	cmp byte [task_switch_postponed],0 ;Was a task switch postponed?
	je .done                           ; no
	mov byte [task_switch_postponed],0 ; yes, clear the flag...
	call do_task_switch                ; ...and do the postponed task switch
.done:
	iret
Now, your scheduler might do (something like) this:
Code: Select all
try_task_switch:
	pushfd
	cli
	cmp dword [task_switch_disable],0  ;Are task switches disabled?
	je .switch                         ; no, do the task switch now
	mov byte [task_switch_postponed],1 ; yes, postpone the task switch
	jmp .done
.switch:
	call do_task_switch
.done:
	popfd
	ret

do_task_switch:
	;Do the task switch here (caller keeps interrupts disabled)
	ret
If any IRQs are in progress, and *anything* attempts to cause a task switch, then that task switch is postponed until all IRQ handlers have finished.
sukhusukhu wrote:Also, what should happen if I am inside the keyboard interrupt handler when IRQ0 delivers a tick to the scheduler and asks for a context switch? (I can't mask interrupts while I am inside the handler for the keyboard or any other device, because I would lose the ability to receive further interrupts.)
IRQ0 would be almost the same as the code for IRQ6 - nothing special.
Don't forget that almost all IRQ handlers can cause task switches. For example, imagine a task that is blocked waiting for disk IO (or keyboard, or...) - the IRQ occurs, the IRQ handler does something that unblocks the waiting task, and the (now unblocked) new task preempts the currently running task.
Also don't forget that normal kernel functions cause task switches too. For example, imagine a task that calls "sleep()" or needs data from disk or something - the task can't continue so it blocks and you end up with nothing to do, so you have to do a task switch to find some other task to run.
This is why I like the method above - you can use it for everything, not just IRQ handlers.
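For example (unblock_task, waiting_task and the queue handling are hypothetical helpers), both a device IRQ and a plain kernel function can funnel into the same routine:
Code: Select all
;A disk IRQ completes a transfer and wakes the task waiting for it:
disk_irq_body:
	mov eax,[waiting_task]
	call unblock_task                  ;put it back on the ready queue
	call try_task_switch               ;postponed automatically if IRQs are nested
	ret

;A task calls "sleep()" and can't continue:
sleep:
	;Put the current task on the sleep queue here
	call try_task_switch               ;task_switch_disable is zero in normal
	ret                                ; kernel code, so this switches immediately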
Cheers,
Brendan
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 6:19 am
by gravaera
Yo:
There is no meaningful reason not to support nested IRQs in your kernel. If you design your IRQ handling correctly, you won't lose out on IRQs from any device by nesting into another handler while in IRQ context.
First of all, IRQ routing should be set up so that devices have priorities assigned to their IRQs. In the legacy IBM-PC routing scheme this was hardcoded into the i8259 pins; with the IO-APIC routing subsystem, your kernel should assign IRQ priorities based on the device in question. For example, depending on the throughput goals of your kernel, you may decide to give the pin that hosts a network card's IRQ a higher priority than, say, a disk device's DMA controller IRQ. So if an IRQ comes in while you are servicing another, it should, if you set up your routing correctly, be from a higher priority device in the first place.
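To make that concrete: on x86 the local APIC derives an IRQ's priority class from its vector number (vector >> 4), so with the IO-APIC you assign priority simply by choosing which vector a pin gets. A sketch (IOAPIC_BASE, NIC_PIN and DISK_PIN are assumptions for this example, and only the low dword of each redirection entry is written):
Code: Select all
;Write the low dword of IO-APIC redirection entry 'pin'
;(register index 0x10 + 2*pin; bits 0-7 of the value are the vector).
;in: ebx = pin number, eax = low dword of the entry
set_ioapic_vector:
	mov edi,[IOAPIC_BASE]              ;MMIO base of the IO-APIC
	lea ecx,[0x10+ebx*2]               ;register index for this pin's low half
	mov [edi],ecx                      ;IOREGSEL is at offset 0x00
	mov [edi+0x10],eax                 ;IOWIN is at offset 0x10
	ret

;Network card gets vector 0x70 (priority class 7),
;the disk gets vector 0x50 (priority class 5):
	mov ebx,NIC_PIN
	mov eax,0x70
	call set_ioapic_vector
	mov ebx,DISK_PIN
	mov eax,0x50
	call set_ioapic_vector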
The whole point of nested IRQs is to ensure that higher priority IRQs are serviced in preference to lower priority ones. Secondly, for level triggered IRQs, only one device can be interrupting on the pin at once, so you won't "miss" IRQs from other devices sharing the pin. For edge triggered IRQs, thanks to Microsoft's efforts (since 2001!), manufacturers are less likely to have edge triggered devices sharing pins. Even so, as long as you are handling the higher priority device's IRQ (which you will be), when you get back to the lower priority device, everything will still be "correct".
At any rate, masking IRQs locally at the CPU while handling a low priority IRQ, so that higher priority IRQs are completely ignored, is more likely to cause you to "miss" IRQs than a proper nested approach.
--Peace out,
gravaera
Re: Scheduler masking interrupts
Posted: Mon May 06, 2013 11:28 am
by rdos
Brendan wrote:If any IRQs are in progress, and *anything* attempts to cause a task switch, then that task switch is postponed until all IRQ handlers have finished.
Yes, this is really useful, but I don't implement it that way. In my OS, it is invalid for IRQs to cause task-switches, and that includes any potentially blocking operation. If IRQs need to synchronize, they must use spinlocks.
Instead I have a syscall pair called Signal/WaitForSignal. Signal, which can be used from IRQs, simply sets a bit in the task control block and, if the thread is waiting for a signal, unblocks it by putting it in a special "schedule these tasks" queue that is checked when the IRQ nesting reaches zero. WaitForSignal is guaranteed never to block after Signal has been called, regardless of whether the signal is set before, simultaneously with, or after the call to WaitForSignal. This is really the only thing needed for IRQs to wake up servers waiting for IO. It also works on multicore.
This matters because it is the main problem in basically all server-thread configurations: the server thread might be running on one core, finished with its work and about to block, when the IRQ occurs on another core and finds the thread still running. Without the sticky signal, the server thread would then stay blocked until the next IRQ occurs, which might be never.
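In sketch form (TCB_FLAGS, TCB_STATE, STATE_WAITING and the helper routines are invented for this example; a real implementation must re-check the bit under a lock before actually sleeping), the sticky signal bit is what closes that race:
Code: Select all
;in: ebx = target thread's TCB. Callable from IRQ handlers.
signal:
	lock bts dword [ebx+TCB_FLAGS],0   ;set the signal bit atomically
	test byte [ebx+TCB_STATE],STATE_WAITING
	jz .done                           ;not waiting: the sticky bit is enough
	call queue_for_wakeup              ;put it on the "schedule these tasks"
.done:                                 ; queue, drained when nesting hits zero
	ret

;Called by the thread itself, never from an IRQ.
waitforsignal:
	mov ebx,[current_tcb]
	lock btr dword [ebx+TCB_FLAGS],0   ;test and clear the signal bit
	jc .done                           ;bit was already set: consume it, no block
	call block_until_signalled         ;bit clear: sleep until Signal arrives
.done:
	ret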