Extraneous IRQ

FlashBurn
Member
Member
Posts: 313
Joined: Fri Oct 20, 2006 10:14 am

Extraneous IRQ

Post by FlashBurn »

I don't know whether it would be better to start a new thread for this.

What would you do if you get an IRQ but no driver claims it for its device? Should I mask the IRQ so that its source cannot slow down the system, or should I just do nothing? The problem with masking is that every other device sharing the same IRQ would become unusable.

So how is this solved when you do not have a driver for a device which is sending irqs on and on?
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: synchronous ipc

Post by Combuster »

FlashBurn wrote:So how is this solved when you do not have a driver for a device which is sending irqs on and on?
While this is a whole different subject, you can tell which devices are connected to an interrupt from the PCI configuration. You can dedicate one interrupt line to unknown devices and assign the devices you do know to the other IRQs, although that shouldn't be necessary, as devices don't normally send interrupts until you configure them.

Even so, if you find an IRQ that no driver claims, you can determine which subset of physical devices could be responsible and find the offending device by elimination (PCI is normally level triggered, so an unhandled IRQ will keep coming back). You can then restart the driver or lock the device down so that it won't bother you.
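
As a rough illustration of "which devices are connected to an interrupt", here is a minimal C sketch that decodes the Interrupt Line and Interrupt Pin fields from the dword at offset 0x3C of a device's configuration header. The names `pci_irq_info`, `pci_decode_irq` and `pci_may_raise_irq` are made up for this example, and reading the dword itself is left to the caller. Note the assumption that the Interrupt Line value written by the firmware is still valid, which is generally only true for legacy PIC routing.

```c
#include <stdint.h>

/* Interrupt routing fields of a PCI configuration header.
 * The dword at offset 0x3C holds Interrupt Line in bits 0-7
 * and Interrupt Pin in bits 8-15. */
struct pci_irq_info {
    uint8_t line; /* IRQ number set by firmware; 0xFF = unknown / not connected */
    uint8_t pin;  /* 0 = no INTx# pin used, 1..4 = INTA#..INTD# */
};

/* cfg_3c is the raw 32-bit value read from config offset 0x3C.
 * How it is read (ports 0xCF8/0xCFC, memory mapped, ...) is up to the caller. */
static struct pci_irq_info pci_decode_irq(uint32_t cfg_3c)
{
    struct pci_irq_info info;
    info.line = (uint8_t)(cfg_3c & 0xFF);
    info.pin  = (uint8_t)((cfg_3c >> 8) & 0xFF);
    return info;
}

/* Nonzero if this device is a candidate source for the given IRQ:
 * it uses an INTx# pin and is routed to that line. */
static int pci_may_raise_irq(uint32_t cfg_3c, uint8_t irq)
{
    struct pci_irq_info info = pci_decode_irq(cfg_3c);
    return info.pin != 0 && info.line == irq;
}
```

Running `pci_may_raise_irq` over every enumerated function gives the subset of devices to eliminate one by one when an IRQ goes unclaimed.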
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Owen
Member
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom
Contact:

Re: Extraneous IRQ

Post by Owen »

Of course, a misbehaved device can hold a PCI interrupt line low and you can't do anything to stop it, and, really, is it worth your time to prevent a problem which is pretty much non-existent?
FlashBurn
Member
Member
Posts: 313
Joined: Fri Oct 20, 2006 10:14 am

Re: Extraneous IRQ

Post by FlashBurn »

So I can ignore this problem, or move all devices for which I have no driver onto one IRQ that I can then mask.

Another question: which is better, waiting until all drivers have checked whether it was their device and then sending the EOI, or sending the EOI as soon as I reach my IRQ handler?
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Extraneous IRQ

Post by Combuster »

Owen wrote:Of course, a misbehaved device can hold a PCI interrupt line low and you can't do anything to stop it, and, really, is it worth your time to prevent a problem which is pretty much non-existent?
Obviously, a device is broken if it short-circuits the interrupt line. A device that signals that it wants to interrupt the host is not broken by definition, however, but it does cause the problem mentioned.
Owen
Member
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom
Contact:

Re: Extraneous IRQ

Post by Owen »

Combuster wrote:
Owen wrote:Of course, a misbehaved device can hold a PCI interrupt line low and you can't do anything to stop it, and, really, is it worth your time to prevent a problem which is pretty much non-existent?
Obviously, a device is broken if it shortcircuits the interrupt line off. A device that tells it wants to interrupt the host is however not broken by definition, but it does cause the mentioned problem.
Who said anything about short circuiting? A PCI device requests an interrupt from the processor by holding one of the interrupt lines low. The motherboard has a pull-up resistor in order to hold it high when no interrupt is being requested. A device holding the line low is just requesting an interrupt.

Of course, it's still broken if it does it when the host hasn't configured it.
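
To make the electrical picture concrete, here is a toy C model of a shared, active-low INTx# line: the pull-up keeps the line high unless at least one device pulls it low. The function names are invented for this sketch and it models nothing beyond the behaviour described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* pulls_low[i] is true while device i is driving the shared INTx# line low.
 * Returns the level of the line: true = high (idle), false = low. */
static bool intx_line_level(const bool *pulls_low, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pulls_low[i])
            return false; /* any single device can pull the line low */
    }
    return true; /* nobody is pulling: the pull-up resistor holds it high */
}

/* The line is active low, so "interrupt requested" means "line is low". */
static bool irq_asserted(const bool *pulls_low, size_t n)
{
    return !intx_line_level(pulls_low, n);
}
```

This is also why one stuck device is enough to keep the interrupt asserted for everything that shares the line.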
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Extraneous IRQ

Post by Combuster »

Oh well, must have misread somewhere when I thought that PCI interrupts were level triggered, active high... My bad. #-o

Anyway, care saves you from the following cases (in order of probability):
  • Broken or crashed drivers that can't turn off the interrupt signal
  • Devices configured by the firmware that raise an interrupt later, or that already have an interrupt pending on a line that was masked off
  • Devices raising an interrupt without intervention
  • Faulty hardware
So even if we wipe out the points covered by Owen's and my previous arguments, we're still left with the most probable cause... :wink:
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Extraneous IRQ

Post by Brendan »

Hi,

Did anyone notice that there's an "interrupt disable" flag in the "device control register" (in the device's PCI configuration space)?

My suggestion is, when enumerating devices disable all of them that you can (including setting the "interrupt disable" flag and clearing the bits that control the device's ability to respond to I/O space accesses, respond to memory space accesses and act as a bus master). Then, when you start device drivers you re-enable these things for each device.
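
A minimal sketch of the bit manipulation involved, assuming the flag in question is bit 10 of the 16-bit Command register at configuration offset 0x04 (that bit is only defined for newer revisions of the PCI specification, so older devices may ignore it). The helper names `pci_cmd_quiesce` and `pci_cmd_enable` are invented here, and reading and writing the register is left to the caller.

```c
#include <stdint.h>

/* Bits of the PCI Command register (configuration offset 0x04). */
#define PCI_CMD_IO_SPACE     (1u << 0)  /* respond to I/O space accesses */
#define PCI_CMD_MEM_SPACE    (1u << 1)  /* respond to memory space accesses */
#define PCI_CMD_BUS_MASTER   (1u << 2)  /* may act as a bus master */
#define PCI_CMD_INTX_DISABLE (1u << 10) /* assumed: only on devices that implement it */

#define PCI_CMD_DECODE_MASK \
    (PCI_CMD_IO_SPACE | PCI_CMD_MEM_SPACE | PCI_CMD_BUS_MASTER)

/* Value to write back while enumerating: no decoding, no bus mastering,
 * INTx# disabled. All other bits are preserved. */
static uint16_t pci_cmd_quiesce(uint16_t cmd)
{
    cmd = (uint16_t)(cmd & ~PCI_CMD_DECODE_MASK);
    cmd = (uint16_t)(cmd | PCI_CMD_INTX_DISABLE);
    return cmd;
}

/* Value to write when a driver starts: enable only what the driver asks
 * for (a subset of PCI_CMD_DECODE_MASK) and allow INTx# again. */
static uint16_t pci_cmd_enable(uint16_t cmd, uint16_t wanted)
{
    cmd = (uint16_t)(cmd | (wanted & PCI_CMD_DECODE_MASK));
    cmd = (uint16_t)(cmd & ~PCI_CMD_INTX_DISABLE);
    return cmd;
}
```

The same `pci_cmd_quiesce` value can be written again if a driver crashes or is unloaded.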

You could also disable the device if the device driver crashes or causes IRQ floods or is "unloaded" for any other reason.

In this case, if you get an IRQ but none of the device drivers claims it, then I'd be tempted to ignore the IRQ (maybe the IRQ disappeared for some reason). If the IRQ didn't disappear for some reason then ignoring the IRQ would cause an IRQ flood, and your "IRQ flood detection" could notice and kill the device drivers using that IRQ, one by one (while disabling the devices) until the IRQ flood stops (until you know which device driver was borked). Then you'd be able to restart the "innocent" device drivers again.
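
One possible shape for the "IRQ flood detection" mentioned above, as a hedged C sketch: count unclaimed interrupts per IRQ line within a time window and report a flood once a threshold is reached. The window length, the threshold, and all names here are arbitrary assumptions for illustration, not values from any real kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; a real kernel would tune these. */
#define FLOOD_WINDOW_TICKS 100u
#define FLOOD_THRESHOLD    1000u

/* One instance per IRQ line. */
struct irq_flood_state {
    uint64_t window_start; /* tick at which the current window began */
    uint32_t unclaimed;    /* unclaimed IRQs seen in the current window */
};

/* Call whenever no driver claimed an interrupt on this line.
 * Returns true once the count in the current window reaches the threshold. */
static bool irq_note_unclaimed(struct irq_flood_state *s, uint64_t now)
{
    if (now - s->window_start >= FLOOD_WINDOW_TICKS) {
        /* Window expired: start counting again from zero. */
        s->window_start = now;
        s->unclaimed = 0;
    }
    s->unclaimed++;
    return s->unclaimed >= FLOOD_THRESHOLD;
}
```

When this returns true, the handler would start stopping the drivers on that line one by one (disabling each device as it goes) and watch whether the count keeps climbing.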

Also note that my normal "device detection" advice involves manually probing for old ISA devices (if necessary) after all PCI devices have been disabled (but before any of them have been enabled again), to minimise the chance of conflicts/problems during the manual probing.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Extraneous IRQ

Post by Combuster »

Brendan wrote:Did anyone notice that there's an "interrupt disable" flag in the "device control register" (in the device's PCI configuration space)?
No I didn't, and in my copy (v2.2), it isn't there - neither the flag nor the register, which makes that approach unusable if the hardware isn't compatible.
Owen
Member
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom
Contact:

Re: Extraneous IRQ

Post by Owen »

Combuster wrote:Oh well, must have misread somewhere when I thought that PCI interrupts were level triggered, active high... My bad. #-o

Anyway, care saves you from the following cases (in order of probability):
  • Broken or crashed drivers that can't turn off the interrupt signal
  • Devices configured by the firmware giving off an interrupt later, or have the pending interrupt line masked off
  • Devices raising an interrupt without intervention
  • Faulty hardware
So even if we wipe out the points covered by Owen's and my previous arguments, we're still left with the most probable cause still in place... :wink:
If you look at the pinouts, they tend to be called "INTA#", etc. A signal name postfixed by a hash or a lower case n, prefixed with a /, or with an overbar is probably active low. As you can tell, there are lots of different conventions for this!

As for why active low: in general, an output transistor has historically been able to pull low more strongly than it can pull high.
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Extraneous IRQ

Post by Brendan »

Hi,
Combuster wrote:
Brendan wrote:Did anyone notice that there's an "interrupt disable" flag in the "device control register" (in the device's PCI configuration space)?
No I didn't, and in my copy (v2.2), it isn't there - neither the flag nor the register, which makes that approach unusable if the hardware isn't compatible.
You're right - the "interrupt disable" flag didn't exist in the PCI Local Bus Specification Revision 2.2 (December 18, 1998), or in any older version of the specification. It does exist in PCI Local Bus Specification Revision 2.3 (March 29, 2002) and all newer versions of the specification (as far as I can tell).

This would imply that:
  • For PCI 2.3 or later you can disable the device's ability to generate IRQs.
  • For PCI 2.2, if the device supports MSI and you're using the I/O APIC, you can enable MSI and configure it to generate a very low priority interrupt (which disables the device's ability to generate IRQs using its "INTx#" pin) and then never send an EOI for the MSI (which prevents the device's MSI from generating more than one interrupt).
  • For PCI 2.2 devices that don't support MSI (or devices that do support MSI when an I/O APIC isn't being used), and for older devices (which don't support MSI or the interrupt disable), you'd need to mask the IRQ line at the PIC or I/O APIC and kill all devices that share that IRQ line.
Of course it's possible to have a mixture of (old, newer and new) PCI cards sharing the same interrupt line. This complicates things a little more. If you assume that the OS uses MSI to avoid IRQ sharing whenever possible, then you can disable (non-MSI) IRQs on newer devices first to see if they're causing the IRQ flood, and only mask the IRQ in the PIC or I/O APIC if it wasn't a newer device causing the IRQ flood.
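
The three cases above can be sketched as a small decision function. This is only an illustration of the ordering described in this post; the type and function names are made up, and detecting the capabilities themselves (spec revision, MSI capability) is assumed to happen elsewhere.

```c
#include <stdbool.h>

/* How to silence a device that is suspected of causing an IRQ flood. */
enum quiet_method {
    QUIET_INTX_DISABLE, /* set the device's "interrupt disable" flag */
    QUIET_VIA_MSI,      /* switch to MSI and never send an EOI for it */
    QUIET_MASK_LINE     /* mask the line at the PIC or I/O APIC */
};

/* Capabilities gathered during enumeration (hypothetical structure). */
struct pci_dev_caps {
    bool has_intx_disable; /* device implements the interrupt disable flag */
    bool has_msi;          /* device has an MSI capability */
};

static enum quiet_method choose_quiet_method(struct pci_dev_caps caps,
                                             bool using_ioapic)
{
    if (caps.has_intx_disable)
        return QUIET_INTX_DISABLE;
    if (caps.has_msi && using_ioapic)
        return QUIET_VIA_MSI;
    /* Last resort: this also takes out every device sharing the line. */
    return QUIET_MASK_LINE;
}
```

With a mixture of cards on one line, the device manager would try the devices for which this returns something other than `QUIET_MASK_LINE` first, and only mask the line if the flood continues.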

Also, it should be possible to have a "device manager" that handles all of this (without needing each device driver to support it) - the device driver/s would only need a more generic "unload the device driver" feature. This is important because dodgy device drivers are probably the most common reason for IRQ floods.


Cheers,

Brendan