Receiving unnecessary interrupt from Intel 8254x

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 10:24 am

After modifying my interrupt system to properly handle level-triggered interrupts, and catching unhandled interrupts, I have an occasional bug (possibly related to some sort of race condition?) with the Intel 8254x NIC driver. Once in a while, the following sequence of events occurs (the Intel 8254x is the only device connected to that interrupt line):

1. The NIC receives a packet, sets the "packet received" bit in its ICR (interrupt cause register), and triggers an interrupt.
2. The interrupt handler is invoked and interrupts are disabled (IF=0).
3. The interrupt handler reads the ICR, which also automatically sets it to zero. At this point, therefore, the device should stop sending an interrupt.
4. The interrupt is handled.
5. As soon as I enable interrupts, I get ANOTHER interrupt from the NIC. This time, ICR is zero. Therefore, the NIC interrupt handler reports that the interrupt does not come from the NIC; no other devices are on the interrupt line, so the interrupt remains unhandled, and so the kernel panics.

Why do I get an interrupt in step (5) (only sometimes), when step 3 should have prevented the interrupt from coming?
Am I missing something in this interrupt handling procedure?
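
For reference, the handler logic in steps 1 to 5 can be sketched as below. This is only an illustrative model, not the actual driver: `struct nic_model`, `nic_read_icr`, `nic_irq_handler` and the `CAUSE_PACKET_RECEIVED` value are hypothetical names chosen for the example, and the register is simulated in memory rather than read over MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value only; check the datasheet for the real bit layout. */
#define CAUSE_PACKET_RECEIVED 0x80u

/* Hypothetical in-memory stand-in for the NIC's read-to-clear ICR. */
struct nic_model {
    uint32_t icr;
};

/* Reading the ICR returns the pending causes and clears them (step 3). */
static uint32_t nic_read_icr(struct nic_model *nic)
{
    uint32_t cause = nic->icr;
    nic->icr = 0;
    return cause;
}

/*
 * Returns true if the device claims the interrupt, false if the ICR was
 * already zero (the situation described in step 5).
 */
bool nic_irq_handler(struct nic_model *nic, uint32_t *cause_out)
{
    assert(nic != NULL && cause_out != NULL);
    uint32_t cause = nic_read_icr(nic);
    if (cause == 0) {
        return false;   /* not our device, as far as the driver can tell */
    }
    *cause_out = cause; /* hand the cause to whoever services the packet */
    return true;
}
```

With this model, a second call made right after a successful one returns false, which is the "ICR is zero" case that currently leads to the panic.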

davidv1992 · Post by **davidv1992** » Sat Apr 29, 2017 12:26 pm

At what point does your code send an interrupt acknowledge to the interrupt controller?

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 12:57 pm

davidv1992 wrote: At what point does your code send an interrupt acknowledge to the interrupt controller?

Right after ICR has been read.

davidv1992 · Post by **davidv1992** » Sat Apr 29, 2017 1:06 pm

If (and only if) this is physical hardware, it could simply be that the interrupt controller is picking up on the bouncing of the physical interrupt line carrying the signal (hardware isn't perfect). You could try to do a short wait between reading the ICR and acknowledging the interrupt to the controller to see if that helps.

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 1:09 pm

It's in VirtualBox.

And how would I do this waiting sanely? It seems that waiting a specific period of time in an interrupt handler with interrupts disabled would be a performance disaster.

davidv1992 · Post by **davidv1992** » Sat Apr 29, 2017 1:15 pm

If it is in VirtualBox, then bouncing/ringing of the interrupt line cannot be the cause.

If it were on physical hardware, I would do the short wait either through a very short busy loop (but be careful to keep the compiler from optimizing it out), or (if available) by using the timestamp counter to get the same effect. Both will indeed impact performance and are not optimal, but they are effective for diagnosing whether that is the case. Long term, I would actually consider just ignoring (or logging) the spurious interrupt.
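
A minimal sketch of such a busy loop, assuming a plain iteration count is good enough for a diagnostic delay. The function name `spin_delay` is made up for this example. The `volatile` qualifier is what stops the compiler from removing the loop; a timestamp-counter variant would instead compare two counter readings.

```c
#include <stdint.h>

/*
 * Spin for `iterations` loop passes and return how many were executed.
 * Because the counter is volatile, every read and write of it must
 * actually happen, so the loop cannot be optimized away.
 */
uint32_t spin_delay(uint32_t iterations)
{
    volatile uint32_t i = 0;
    while (i < iterations) {
        i = i + 1;
    }
    return i;
}
```

The delay per iteration depends on the CPU and compiler, so this is only suitable for a quick experiment, not for timing anything precisely.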

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 1:20 pm

OK, I guess I'll make the kernel go through the status register of every PCI device in this case, and if they're all clear, assume the interrupt is spurious and ignore it.

Brendan · Post by **Brendan** » Sat Apr 29, 2017 2:26 pm

Hi,

When you get an unnecessary IRQ from the NIC; what was the cause of the previous (necessary) IRQ?

Note: There are multiple different "packet received" IRQs (e.g. "small receive packet detected", "receive descriptor minimum threshold reached", "receive timer interrupt") where it's possible to get 2 or more causes of an interrupt at once.

I'm thinking something like:

1. The NIC receives a packet, sets the "packet received" bit in its ICR (interrupt cause register), and triggers an interrupt.
2. The interrupt handler is invoked and interrupts are disabled (IF=0).
3a. The interrupt handler reads the ICR, which also automatically sets it to zero. At this point, therefore, the device should stop sending an interrupt.
3b. The NIC realises that at least one of the conditions that caused the IRQ is still present, and sets the bit in ICR again
3c. You send EOI to the PIC or IO APIC before you've finished handling the actual cause of the IRQ
3d. PIC or IO APIC tries to send the new/second IRQ to the CPU, but IRQs are disabled at the CPU so it gets postponed
4. The interrupt is handled.
4a. The ICR bit gets reset to zero because whatever condition/s that caused the IRQ are gone now
5. As soon as you enable interrupts, you get the interrupt that was postponed at "step 3d".
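
The hypothesised sequence above can be replayed with a toy model. This is only a sketch of the hypothesis, not of real controller behaviour: `struct irq_sim`, `sim_eoi`, `replay_sequence` and `NO_SECOND_IRQ` are invented for the example, and the interrupt controller is reduced to a single "latched at the CPU" flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_SECOND_IRQ UINT32_MAX  /* sentinel: nothing was latched */

struct irq_sim {
    uint32_t icr;        /* NIC cause bits; the line is asserted while nonzero */
    bool cpu_pending;    /* IRQ latched and waiting for interrupts to be enabled */
};

/* Level-triggered EOI: if the line is still asserted, the IRQ is sent again. */
static void sim_eoi(struct irq_sim *s)
{
    if (s->icr != 0) {
        s->cpu_pending = true;
    }
}

/*
 * Replays steps 3a to 5. Returns the ICR value the second handler invocation
 * would read, or NO_SECOND_IRQ if no second interrupt is delivered.
 */
uint32_t replay_sequence(bool cause_reasserts)
{
    struct irq_sim s = { .icr = 0x80u, .cpu_pending = false };

    s.icr = 0;                 /* 3a: handler reads ICR, which clears it */
    if (cause_reasserts) {
        s.icr = 0x80u;         /* 3b: condition still present, bit set again */
    }
    sim_eoi(&s);               /* 3c/3d: EOI sent early, new IRQ is postponed */
    s.icr = 0;                 /* 4/4a: cause is dealt with, bit goes away */

    if (!s.cpu_pending) {
        return NO_SECOND_IRQ;
    }
    return s.icr;              /* 5: postponed IRQ arrives, ICR reads zero */
}
```

In this model `replay_sequence(true)` yields a second interrupt with ICR equal to zero, while `replay_sequence(false)` yields no second interrupt at all.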

mariuszp wrote: OK, I guess I'll make the kernel go through the status register of every PCI device in this case, and if they're all clear, assume the interrupt is spurious and ignore it.

There's no need to check drivers that aren't using that IRQ.

Note that the driver itself shouldn't know or care if the kernel felt like using PIC or IO APIC or MSI; and the driver itself also shouldn't know or care if the IRQ is being shared by other PCI devices or not. If the same IRQ actually was being shared by other PCI devices then your driver would be broken (it would send an EOI even when the IRQ was intended for a different device driver that happens to be sharing the same IRQ). In other words, your driver should probably be considered broken simply because it assumes the IRQ is not shared.

In general; the kernel would have a list of none, one or more drivers that are using an IRQ. When that IRQ occurs the kernel would notify the drivers in the list one at a time. A driver would try to handle the IRQ and return some kind of "it was/wasn't my device" status back to the kernel. If the device returned "it was my device" then the kernel stops notifying any other drivers in that list and sends EOI. If the device returned "it was not my device" then kernel notifies the next driver in the list (if any).
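
As a rough sketch of that dispatch loop, assuming a singly linked list of drivers per IRQ line: all names here (`struct irq_driver`, `struct irq_line`, `irq_dispatch`, `flag_handler`) are hypothetical, and the EOI is represented by a counter instead of a write to the interrupt controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A driver's handler returns true for "it was my device". */
typedef bool (*irq_handler_fn)(void *ctx);

struct irq_driver {
    irq_handler_fn handler;
    void *ctx;
    struct irq_driver *next;
};

struct irq_line {
    struct irq_driver *drivers;  /* none, one or more drivers on this IRQ */
    unsigned eoi_sent;           /* stands in for the real EOI write */
};

/* Example handler: claims the IRQ if the flag behind ctx is set, then clears it. */
bool flag_handler(void *ctx)
{
    bool *pending = ctx;
    bool was_pending = *pending;
    *pending = false;
    return was_pending;
}

/*
 * Notify drivers one at a time. Stop at the first one that claims the IRQ
 * and send EOI. Returns false if nobody claimed it (no EOI is sent here).
 */
bool irq_dispatch(struct irq_line *line)
{
    assert(line != NULL);
    for (struct irq_driver *d = line->drivers; d != NULL; d = d->next) {
        if (d->handler(d->ctx)) {
            line->eoi_sent++;
            return true;
        }
    }
    return false;
}
```

What to do when `irq_dispatch` returns false is a separate policy decision and is left to the caller in this sketch.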

If none of the drivers in the list say their device caused the IRQ, then the kernel assumes something is wrong. I'd be tempted to mask that IRQ and tell any/all drivers that they are being terminated. However, I'd also be tempted to use a "number of times in a row that this IRQ occurred for no reason" counter, which is reset to zero whenever any driver in the list says their device did cause the IRQ and incremented when none of them say their device caused the IRQ. If this counter reaches some threshold when it's incremented (maybe 3 unnecessary IRQs in a row?) then mask the IRQ and tell all the drivers they're being terminated. This gives some leniency while still guarding against IRQ floods.
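
The "in a row" counter could look something like this. It is a sketch under the assumption that the kernel records the outcome of each IRQ on a line; `struct irq_health`, `irq_note_result` and `UNCLAIMED_IRQ_LIMIT` are illustrative names, and the limit of 3 is just the value floated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UNCLAIMED_IRQ_LIMIT 3u  /* "maybe 3 unnecessary IRQs in a row?" */

struct irq_health {
    unsigned unclaimed_in_a_row;
    bool masked;                /* set once the line has been given up on */
};

/*
 * Record whether any driver claimed the latest IRQ. Returns true when the
 * threshold is reached, i.e. when the caller should mask the IRQ and tell
 * the drivers on that line that they are being terminated.
 */
bool irq_note_result(struct irq_health *h, bool claimed)
{
    assert(h != NULL);
    if (claimed) {
        h->unclaimed_in_a_row = 0;  /* any claimed IRQ resets the streak */
        return false;
    }
    h->unclaimed_in_a_row++;
    if (h->unclaimed_in_a_row >= UNCLAIMED_IRQ_LIMIT) {
        h->masked = true;
        return true;
    }
    return false;
}
```

With the pattern reported later in the thread (one good interrupt followed by one spurious one), the streak never exceeds 1, so the line would stay unmasked.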

Cheers,

Brendan

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 2:36 pm

The necessary IRQ was the reception of a packet. In the IRQ handler, I just read the ICR and postpone the handling of the actual packet (which is a more complex process) to another thread; that thread, however, does NOT have time to read the packet anyway, because as soon as interrupts are enabled, the next interrupt arrives, and ICR is zero (AFAIK, the bits are sticky; they don't get cleared just because "the condition stopped being true"). If I ignore this interrupt and send EOI anyway, nothing breaks.

Other than that, I follow exactly the process you just described (mainly because you suggested it in another thread to me and I implemented it).

Strangely enough, however, the "interrupt status" bit in the PCI status register is NEVER set.

EDIT: To clarify, after this one spurious interrupt, no more spurious ones arrive in a row; instead, the pattern changes to one good interrupt followed by one spurious one. But even after that, the driver has enough time to service interrupts and the OS remains stable.

LtG · Post by **LtG** » Sat Apr 29, 2017 3:41 pm

If I understood correctly, the moment you IRET you get a second interrupt? Have you verified it's always exactly after IRET? If it is, then it seems unlikely to be a coincidence and that means the IRET causes the next interrupt. For that to be the case it would likely be because IRET unmasks interrupts, which would mean that the interrupt likely happened earlier (as Brendan mentioned), which would either be due to EOI or not fully servicing the device.

I assume it's easy to disable the EOI sending from the driver? What happens if you do that? For example if you accidentally sent EOI in two places and not just one, it might be that the first EOI is sent prior to servicing the device and thus triggering a new interrupt to be latched on the PIC.

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 4:07 pm

LtG wrote: If I understood correctly, the moment you IRET you get a second interrupt? Have you verified it's always exactly after IRET? If it is, then it seems unlikely to be a coincidence and that means the IRET causes the next interrupt. For that to be the case it would likely be because IRET unmasks interrupts, which would mean that the interrupt likely happened earlier (as Brendan mentioned), which would either be due to EOI or not fully servicing the device.

I assume it's easy to disable the EOI sending from the driver? What happens if you do that? For example if you accidentally sent EOI in two places and not just one, it might be that the first EOI is sent prior to servicing the device and thus triggering a new interrupt to be latched on the PIC.

The driver does not send EOI, and indeed the IRET enables the interrupts.

The driver just tells the kernel whether it confirms that the interrupt came from its device; and THEN the kernel sends EOI (apic->eoi = 0) and returns.

Brendan · Post by **Brendan** » Sat Apr 29, 2017 4:41 pm

Hi,

mariuszp wrote: The necessary IRQ was the reception of a packet. In the IRQ handler, I just read the ICR and postpone the handling of the actual packet (which is a more complex process) to another thread; that thread, however, does NOT have time to read the packet anyway, because as soon as interrupts are enabled, the next interrupt arrives, and ICR is zero (AFAIK, the bits are sticky; they don't get cleared just because "the condition stopped being true"). If I ignore this interrupt and send EOI anyway, nothing breaks.

Other than that, I follow exactly the process you just described (mainly because you suggested it in another thread to me and I implemented it).

I doubt I have ever suggested sending EOI before the IRQ is actually handled; and you are sending EOI before the IRQ is actually handled. More specifically; you send EOI after you read ICR but before your thread has handled the actual cause of the device's IRQ. If you sent EOI at the right time (after your thread has told the kernel it has finished handling the cause of the IRQ) it'd be impossible for a second/unnecessary IRQ to occur before your thread starts.

Note: For gigabit Ethernet (where something else might cause an IRQ while it's handling an IRQ) I'd be tempted to do something like:

Code:

    do {
        wait_or_whatever();   // block this thread (until something unblocks this thread)
        if(IRQ_occured) {     // if the thread was unblocked because an IRQ occurred
            do {
                handle_cause(lastICR);
                lastICR = ICR;             // reading ICR also clears it
            } while(lastICR != 0);
            tell_kernel_finished(status);  // Tell kernel I finished handling the cause of the IRQ

        } else {
            // Handle other things that could've unblocked the thread
        }
    } while(running);       // Go back to waiting (unless driver was terminated somewhere)
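
The inner loop of that pseudocode can be tried out against a scripted register. The following is a sketch only: `struct icr_script`, `read_icr` and `drain_causes` are hypothetical, the ICR is replaced by a list of values that successive reads return, and "handling" a cause is reduced to recording it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scripted stand-in for the ICR: each read returns the next value, then 0. */
struct icr_script {
    const uint32_t *reads;
    size_t count;
    size_t next;
};

static uint32_t read_icr(struct icr_script *nic)
{
    return nic->next < nic->count ? nic->reads[nic->next++] : 0;
}

/*
 * Handle first_cause, then keep re-reading the ICR and handling whatever
 * shows up until a read returns zero. Returns the OR of every cause handled
 * and counts the loop passes in *passes.
 */
uint32_t drain_causes(struct icr_script *nic, uint32_t first_cause,
                      unsigned *passes)
{
    assert(nic != NULL && passes != NULL);
    uint32_t handled = 0;
    uint32_t cause = first_cause;
    do {
        handled |= cause;        /* stands in for handle_cause(lastICR) */
        (*passes)++;
        cause = read_icr(nic);   /* lastICR = ICR */
    } while (cause != 0);
    return handled;
}
```

Only after `drain_causes` returns would the thread report back to the kernel, so causes that appear while earlier ones are being handled are picked up in the same pass rather than by a fresh IRQ.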

mariuszp wrote: Strangely enough, however, the "interrupt status" bit in the PCI status register is NEVER set.

If you're using MSI then I don't think that bit is used. If you're inside an emulator then that bit probably isn't emulated. If it's on real hardware then it's likely to be PCI-E where that bit only indicates that a "legacy PCI interrupt message" is pending and hasn't been sent yet.
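
For completeness, testing that bit is a one-line mask once the 16-bit Status register (configuration space offset 0x06) has been read. The helper name `pci_intx_pending` is made up for this sketch, and how the status word is obtained is left out because it depends on the kernel's own configuration-space accessors. Given the caveats above, a clear bit should not be taken as proof that the device did not raise the IRQ.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_STATUS_OFFSET     0x06u      /* Status register in config space */
#define PCI_STATUS_INTERRUPT  (1u << 3)  /* Interrupt Status bit */

/* True if the Interrupt Status bit is set in an already-read status word. */
bool pci_intx_pending(uint16_t status)
{
    return (status & PCI_STATUS_INTERRUPT) != 0;
}
```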

Cheers,

Brendan

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 4:44 pm

So, it is safe to return from the interrupt handler and allow the driver thread to handle the interrupt and THEN let it send EOI?

EDIT: Note, however, that servicing that interrupt basically involves reading something the controller put in memory and then updating a tail/head register.

Brendan · Post by **Brendan** » Sat Apr 29, 2017 5:01 pm

Hi,

mariuszp wrote: So, it is safe to return from the interrupt handler and allow the driver thread to handle the interrupt and THEN let it send EOI?

EDIT: Note, however, that servicing that interrupt basically involves reading something the controller put in memory and then updating a tail/head register.

For "no IRQ sharing", yes, that's technically safe.

For "no IRQ sharing or IRQ sharing" you'd return from the interrupt handler, then allow the driver thread to handle the interrupt and tell the kernel it finished the interrupt (and that its device was responsible for the IRQ), and then let kernel send EOI (after it has been told something was responsible for the IRQ).
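
One way to picture the kernel side of that deferred flow is a small state record per IRQ that is updated each time a driver thread reports back. This is a sketch under simplifying assumptions: the names (`struct deferred_irq`, `irq_driver_reported`, `enum irq_next`) are hypothetical, drivers are identified by index, the EOI is a counter, and the question of which CPU issues the EOI is not addressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq_next {
    IRQ_EOI_SENT,         /* a driver claimed the IRQ, kernel sent EOI */
    IRQ_ASK_NEXT_DRIVER,  /* not this driver's device, try the next one */
    IRQ_UNCLAIMED         /* every driver on the line declined */
};

struct deferred_irq {
    unsigned driver_count;  /* drivers sharing this IRQ */
    unsigned current;       /* index of the driver currently being asked */
    unsigned eoi_sent;      /* stands in for the real EOI write */
};

/*
 * Called when the current driver's thread tells the kernel it has finished
 * and whether its device was responsible for the IRQ.
 */
enum irq_next irq_driver_reported(struct deferred_irq *irq, bool was_my_device)
{
    assert(irq != NULL);
    if (was_my_device) {
        irq->eoi_sent++;          /* EOI only after the cause was handled */
        irq->current = 0;
        return IRQ_EOI_SENT;
    }
    if (irq->current + 1 < irq->driver_count) {
        irq->current++;
        return IRQ_ASK_NEXT_DRIVER;
    }
    irq->current = 0;
    return IRQ_UNCLAIMED;
}
```

With `driver_count` set to 1 this reduces to the "no IRQ sharing" case: the single driver either claims the IRQ and EOI is sent, or the IRQ goes unclaimed.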

Cheers,

Brendan

mariuszp · Post by **mariuszp** » Sat Apr 29, 2017 5:09 pm

Hang on, but the EOI needs to be sent to the LAPIC... and I have SMP, so the thread may run on a different CPU (so with a different LAPIC).
