Too many IRQs from RTL8139

thepowersgang · Post by **thepowersgang** » Fri May 18, 2012 8:40 am

@NickJohnson

Here is a solution to this problem that I think QNX uses (described from a driver developer's perspective).
The driver registers an interrupt handler that is called essentially in interrupt context (or at least with that interrupt masked / not yet EOI'd).
This handler operates in a very restricted environment and can return a structure that asks the kernel to signal a helper process to do the heavy lifting outside of interrupt context.

This should solve the recurring interrupt issue.
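
As a rough illustration of the scheme described above, here is a minimal user-space sketch in C. It is not the QNX API: `struct irq_event`, `struct nic_dev`, `nic_isr` and `kernel_dispatch_irq` are hypothetical names, and the `status` field only stands in for a real device status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request that a first-level handler hands back to the kernel. */
struct irq_event {
    int helper_pid;   /* process to wake outside interrupt context */
    uint32_t cause;   /* status bits captured by the handler */
};

/* Hypothetical device state; `status` stands in for a status register. */
struct nic_dev {
    volatile uint32_t status;
    int helper_pid;
    struct irq_event event;
};

/* First-level handler: runs while the line is still masked / not yet EOI'd.
 * It only captures and clears the device status, then tells the kernel whom
 * to wake. Returns NULL when this device did not raise the interrupt. */
static const struct irq_event *nic_isr(void *arg)
{
    struct nic_dev *dev = arg;
    uint32_t cause = dev->status;

    if (cause == 0)
        return NULL;

    dev->status = 0;  /* acknowledge, so the device stops asserting the line */
    dev->event.helper_pid = dev->helper_pid;
    dev->event.cause = cause;
    return &dev->event;
}

typedef const struct irq_event *(*isr_fn)(void *arg);

/* Stand-in for real signalling: remembers the last process "woken". */
static int signalled_pid = -1;

/* Kernel side: call the handler, then act on whatever it returned.
 * Returns true if a helper was signalled. */
static bool kernel_dispatch_irq(isr_fn isr, void *arg)
{
    assert(isr != NULL);

    const struct irq_event *ev = isr(arg);
    if (ev == NULL)
        return false;

    signalled_pid = ev->helper_pid;  /* a real kernel would queue a wakeup */
    return true;
}
```

Calling `kernel_dispatch_irq(nic_isr, &dev)` with a non-zero `dev.status` clears the status and records the helper's pid. The point of the design is that the device has already been quieted by the time the kernel issues the EOI or unmasks the line, so the heavy work can run later without the interrupt firing again.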

gerryg400 · Post by **gerryg400** » Fri May 18, 2012 9:00 am

NickJohnson wrote:
gerryg400 wrote: I mean interrupts that might be shared between 2 devices. Say the kernel sends the event to 2 drivers because it doesn't know which one actually caused the interrupt. It shouldn't EOI until each has acked its event.
Each driver should be able to determine whether its device was the one that generated the interrupt, and then only reset the IRQ if it is supposed to. Even in the event of rogue IRQ resets, the interrupt line should stay high because one of the devices hasn't been dealt with, so the IRQ will be immediately regenerated.

Let's say the kernel sends an event to two tasks, A and B. And let's suppose that both interrupts have actually occurred and the interrupt line to the PIC is high.
If A clears his interrupt and unmasks the interrupt before B has cleared his, a new pair of interrupt events will immediately be sent, one to each task. And then, when B services his interrupt and unmasks, another pair of events will be generated.
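
One way to avoid that cascade, in the spirit of the "don't EOI until each has acked" suggestion quoted above, is to count outstanding acknowledgements and unmask only on the last one. The following is a minimal sketch under that assumption; `struct shared_irq`, `shared_irq_raise` and `shared_irq_ack` are hypothetical names, and nobody in this thread has said their kernel works this way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one shared interrupt line. */
struct shared_irq {
    int listeners;     /* drivers registered on this line */
    int pending_acks;  /* drivers that have not yet acked the current event */
    bool masked;       /* whether the line is currently masked */
    int events_sent;   /* total wakeups delivered, kept for illustration */
};

/* Kernel side, when the line fires: mask it and notify every listener. */
static void shared_irq_raise(struct shared_irq *irq)
{
    assert(irq->pending_acks == 0);  /* previous round must be complete */

    irq->masked = true;
    irq->pending_acks = irq->listeners;
    irq->events_sent += irq->listeners;
}

/* Called when one driver has finished with its device.
 * The line is unmasked only after the last outstanding ack.
 * Returns true if this call unmasked the line. */
static bool shared_irq_ack(struct shared_irq *irq)
{
    assert(irq->pending_acks > 0);

    if (--irq->pending_acks > 0)
        return false;

    irq->masked = false;
    return true;
}
```

With two listeners, A's ack leaves the line masked and only B's ack unmasks it, so no extra pair of events is generated in between. The sketch does not check that each driver acks only once; a real implementation would probably track acks per listener.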

NickJohnson · Post by **NickJohnson** » Fri May 18, 2012 10:03 am

gerryg400 wrote:
NickJohnson wrote:
gerryg400 wrote: I mean interrupts that might be shared between 2 devices. Say the kernel sends the event to 2 drivers because it doesn't know which one actually caused the interrupt. It shouldn't EOI until each has acked its event.
Each driver should be able to determine whether its device was the one that generated the interrupt, and then only reset the IRQ if it is supposed to. Even in the event of rogue IRQ resets, the interrupt line should stay high because one of the devices hasn't been dealt with, so the IRQ will be immediately regenerated.
Let's say the kernel sends an event to two tasks, A and B. And let's suppose that both interrupts have actually occurred and the interrupt line to the PIC is high.
If A clears his interrupt and unmasks the interrupt before B has cleared his, a new pair of interrupt events will immediately be sent, one to each task. And then, when B services his interrupt and unmasks, another pair of events will be generated.

The system doesn't actually generate events per se. It only wakes up threads that are waiting for the IRQ to be set. From an interface standpoint, the effect is the same as if all threads were concurrently polling for the interrupt line to be set high, just with much more efficient semantics. You can of course easily transform a "wait for event" system into a "send event" system on a per-listener basis.
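
The "wait for event" to "send event" transformation mentioned above can be sketched as a small per-listener loop. This is only an illustration: `irq_wait_fn`, `irq_event_fn` and `irq_pump` are hypothetical names rather than the actual interface of my kernel, and `fake_irq_wait` and `count_event` are stand-ins so that the sketch runs in user space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical blocking primitive: returns 0 once the IRQ has fired,
 * nonzero when the listener should stop waiting. */
typedef int (*irq_wait_fn)(int irq, void *ctx);

/* Hypothetical event delivery callback for one listener. */
typedef void (*irq_event_fn)(int irq, void *listener);

/* Per-listener adapter: every successful wait becomes one delivered event.
 * Returns the number of events delivered before the wait gave up. */
static int irq_pump(int irq, irq_wait_fn wait, void *wait_ctx,
                    irq_event_fn send, void *listener)
{
    int delivered = 0;

    assert(wait != NULL && send != NULL);

    while (wait(irq, wait_ctx) == 0) {
        send(irq, listener);
        delivered++;
    }
    return delivered;
}

/* Stand-in for the blocking wait: reports `*remaining` interrupts, then stops. */
static int fake_irq_wait(int irq, void *ctx)
{
    int *remaining = ctx;

    (void)irq;
    if (*remaining == 0)
        return -1;
    (*remaining)--;
    return 0;
}

/* Stand-in event sink: counts how many events the listener received. */
static void count_event(int irq, void *listener)
{
    int *count = listener;

    (void)irq;
    (*count)++;
}
```

In a real system each listener would run `irq_pump` in its own thread, with the wait function blocking inside the kernel until the line is set.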

Combuster · Post by **Combuster** » Fri May 18, 2012 2:22 pm

Note that with shared interrupt lines, it's typically easy to predict which device is responsible for the interrupt: the one whose driver dealt with the previous one. That heuristic can save you from waking irrelevant threads but needs a bit more care, especially in microkernel environments where it is trickier to get the "Guilty as charged" message back from the drivers.
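
A minimal sketch of that heuristic, assuming each driver exposes a callback that reports whether its device raised the interrupt. The names `claim_fn`, `struct shared_line` and `dispatch_shared` are hypothetical, and `claim_flag` models a device as a plain pending flag so the example can run in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the driver's device raised the interrupt
 * (the "Guilty as charged" answer). */
typedef bool (*claim_fn)(void *dev);

struct irq_listener {
    claim_fn claim;
    void *dev;
};

struct shared_line {
    struct irq_listener *listeners;
    size_t count;
    size_t last_claimer;  /* index of the listener that claimed the previous IRQ */
    size_t polls;         /* listeners asked so far, kept for illustration */
};

/* Ask listeners in turn, starting with whoever claimed the previous interrupt.
 * Returns the index of the claiming listener, or `count` if nobody claimed. */
static size_t dispatch_shared(struct shared_line *line)
{
    assert(line->count > 0 && line->last_claimer < line->count);

    for (size_t n = 0; n < line->count; n++) {
        size_t i = (line->last_claimer + n) % line->count;

        line->polls++;
        if (line->listeners[i].claim(line->listeners[i].dev)) {
            line->last_claimer = i;
            return i;
        }
    }
    return line->count;
}

/* Stand-in claim callback: the device is just a pending flag. */
static bool claim_flag(void *dev)
{
    bool *pending = dev;
    bool was_pending = *pending;

    *pending = false;
    return was_pending;
}
```

The dispatcher stops at the first listener that claims the interrupt. On a level-triggered line that is tolerable, because a second device that is still asserting the line will cause the IRQ to be raised again once it is unmasked.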

gerryg400 · Post by **gerryg400** » Fri May 18, 2012 4:15 pm

Consider this situation.

The kernel notices an interrupt and notes that it is a shared interrupt. It masks the interrupt and does an EOI.

The 2 tasks poll and are both told that an interrupt has occurred.

Some time later (with the interrupt line into the PIC still high, because A finished before B), A makes an API call that unmasks the interrupt and returns to the top of his processing loop, polling for the next interrupt.

Because B hasn't finished, A's poll succeeds immediately and he's told that another interrupt has occurred. If B were to take a while to service his interrupt, then A might be called several times.

Worse, if A and B are at different priorities, A might prevent B from ever clearing the interrupt line.

NickJohnson · Post by **NickJohnson** » Fri May 18, 2012 9:14 pm

I admit that sort of scenario could happen. However, it doesn't seem to be that bad a situation (unless one driver has a higher priority than the other). In most cases both drivers are not going to be waiting on IRQs simultaneously, and the latency for each driver is probably pretty similar, which minimizes the number of extra wakeups. The worst that can happen is that one of the drivers effectively sleep-polls for its device to be ready, which means poor performance, but not a system hang or crash. For my current purposes, that's good enough.

gerryg400 · Post by **gerryg400** » Fri May 18, 2012 9:55 pm

NickJohnson wrote:
For my current purposes, that's good enough.

In fact, good enough's always good enough.

jbemmel · Post by **jbemmel** » Fri May 18, 2012 10:33 pm

Related to this topic: in my kernel I create a separate descriptor for use by interrupt handlers. It covers the same space as the regular CS descriptor, but because it has a different selector value, it is easy to check whether code is running in interrupt context (by checking the CS selector).
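
A minimal sketch of such a check, assuming a GDT layout that is not taken from my kernel: the selector values `KERNEL_CS` and `IRQ_CS` are made up for illustration, as are the function names. It also assumes the interrupt gates in the IDT name the alias selector, so the CPU loads it into CS on entry to a handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical selectors: both descriptors cover the same code space. */
#define KERNEL_CS         0x08u  /* regular kernel code segment */
#define IRQ_CS            0x18u  /* alias used only by interrupt gates */
#define SELECTOR_RPL_MASK 0x03u  /* low two bits hold the privilege level */

/* Pure check on a selector value, ignoring the RPL bits. */
static bool selector_is_irq_context(uint16_t cs)
{
    return (cs & ~SELECTOR_RPL_MASK) == IRQ_CS;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Read the current CS selector (x86 only, GCC/Clang inline assembly). */
static inline uint16_t read_cs(void)
{
    uint16_t cs;

    __asm__ volatile ("mov %%cs, %0" : "=r"(cs));
    return cs;
}

/* True when the running code was entered through an interrupt gate
 * that uses the alias selector. */
static inline bool in_irq_context(void)
{
    assert(KERNEL_CS != IRQ_CS);
    return selector_is_irq_context(read_cs());
}
#endif
```

Because the saved CS is restored when the handler returns, the check needs no counter or flag that has to be maintained by hand.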

OSDev.org
