Hey everyone,
I'm having an issue on real hardware only (I can't reproduce it in emulators): as soon as I unmask the IRQs belonging to the INTA, INTB, INTC and INTD pins, I get flooded with IRQs and my kernel basically locks up (everything takes forever to process because it gets interrupted all the time).
At first I thought: hey, I don't disable IRQs on the PCI devices. So I made sure to disable the PCI devices' ability to generate interrupts by writing 0x400 to the PCI command register during device enumeration, and then to unmask only the devices for which I have drivers. However, this made no difference; the devices don't seem to respond to the disabling. So now I'm at a loss. I use ACPICA for device enumeration (and PCI routing), so I was looking into the ACPI method _DIS (disable), but I see no way to enable them again?
Anyone have any thoughts?
PCI devices and irq flooding
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: PCI devices and irq flooding
PCI interrupts are level-triggered by default, so if you don't silence the device causing them, they will retrigger the moment you acknowledge the interrupt. PCI 2.3 has an option to force-disable interrupts; in other cases you can try to disable the device or alter the interrupt line to see if you can identify the device in question.
Re: PCI devices and irq flooding
Yes exactly, you are correct. However, I do force-disable interrupts by writing 0x400 to the command register (bit 10 is interrupt disable), and they still seem to occur. I should mention I'm using the I/O APIC, and thus the interrupt_line should not be relevant, right?
My problem is I don't know which device is producing the interrupts
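For reference, here is a minimal sketch of toggling the interrupt disable bit with a read-modify-write instead of storing a bare 0x400. The macro and function names are illustrative and not taken from the thread; the bit positions are those of the standard PCI command register. Note that writing 0x400 directly to the 16-bit command register also clears the I/O space, memory space and bus master enable bits (bits 0 to 2), which is one possible reason a device stops responding after such a write.

```c
#include <stdint.h>

/* Standard PCI command register bits (16-bit register at config offset 0x04). */
#define PCI_COMMAND_IO_SPACE      (1u << 0)   /* respond to I/O space accesses     */
#define PCI_COMMAND_MEMORY_SPACE  (1u << 1)   /* respond to memory space accesses  */
#define PCI_COMMAND_BUS_MASTER    (1u << 2)   /* allow the device to master the bus */
#define PCI_COMMAND_INTX_DISABLE  (1u << 10)  /* PCI 2.3+: block INTx assertion    */

/* Illustrative helper: return the command value with INTx disabled,
 * leaving every other bit exactly as it was read from the device. */
static inline uint16_t pci_command_with_intx_disabled(uint16_t command)
{
    return (uint16_t)(command | PCI_COMMAND_INTX_DISABLE);
}

/* Illustrative helper: return the command value with INTx enabled again. */
static inline uint16_t pci_command_with_intx_enabled(uint16_t command)
{
    return (uint16_t)(command & ~PCI_COMMAND_INTX_DISABLE);
}
```

The caller would read the current command register, pass it through one of these helpers, and write the result back, so a device's decode enables survive the change.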
- Combuster
Re: PCI devices and irq flooding
That bit was only added in PCI 2.3 and isn't universally supported. You can change a device around INTA-INTD and see if the interrupt number changes and isolate it that way.
Re: PCI devices and irq flooding
I'll try to isolate the interrupt when I get home, but I don't think the problem is that PCI 2.3 isn't supported, because I accidentally disabled PCI interrupts for the VGA device through bit 10 and my drawing routine stopped working :p
Re: PCI devices and irq flooding
Hi,
MollenOS wrote: Yes exactly, you are correct, however I do force disable interrupts by writing 0x400 to the command register (bit 10 is interrupt disable), and they still seem to occur, I should mention im using the I/O apic and thus the interrupt_line should not be relevant, right?
The "interrupt line" field in PCI configuration space is irrelevant when using IO APICs (it's only relevant when using PIC chips).
If you've disabled the ability to generate IRQs in the devices themselves and still get an IRQ flood; then maybe you've got them configured as "level triggered active high" instead of "level triggered active low" in the IO APIC (which would cause the IO APIC to think there's an IRQ whenever there isn't one).
For disabling devices; I'd recommend that early during boot you:
- Mask every "PIC IRQ" and disable every "IO APIC input" when first configuring the PIC and IO APIC/s; and install the "spurious IRQ" handlers (2 for PIC and one for each IO APIC)
- Write 0x00000000 to every device's Device Control register. This should disable the devices completely, regardless of whether they're PCI 2.3 (with the "interrupt disable" flag) or not.
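The second step above could be sketched roughly as follows. This assumes that "Device Control register" refers to the 32-bit command/status dword at configuration offset 0x04, and that the legacy configuration mechanism (ports 0xCF8 and 0xCFC) is in use; neither is stated in the post. The `port_write32_fn` hook and the function names are hypothetical stand-ins for whatever port I/O primitive the kernel provides.

```c
#include <stdint.h>

#define PCI_CONFIG_ADDRESS_PORT 0x0CF8u  /* legacy config mechanism: address port */
#define PCI_CONFIG_DATA_PORT    0x0CFCu  /* legacy config mechanism: data port    */
#define PCI_COMMAND_OFFSET      0x04u    /* command (low 16) + status (high 16)   */

/* Hypothetical hook for a 32-bit port write (e.g. a wrapper around `outl`). */
typedef void (*port_write32_fn)(uint16_t port, uint32_t value);

/* Build the CONFIG_ADDRESS value for a bus/device/function/register.
 * Bit 31 enables the access; the register offset is dword-aligned. */
static inline uint32_t pci_config_address(uint8_t bus, uint8_t device,
                                          uint8_t function, uint8_t offset)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(device & 0x1Fu) << 11)
         | ((uint32_t)(function & 0x07u) << 8)
         | (uint32_t)(offset & 0xFCu);
}

/* Assumption: "disable the device" means clearing the command register.
 * The status half is write-1-to-clear, so writing zeros leaves it untouched. */
static inline void pci_quiesce_function(port_write32_fn out32, uint8_t bus,
                                        uint8_t device, uint8_t function)
{
    out32(PCI_CONFIG_ADDRESS_PORT,
          pci_config_address(bus, device, function, PCI_COMMAND_OFFSET));
    out32(PCI_CONFIG_DATA_PORT, 0x00000000u);
}
```

A driver would later restore the enable bits it needs when it claims the device.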
Finally, I'd be tempted to put sanity checks in place such that, in case of hardware failures (and/or driver bugs), the OS will detect IRQ flooding and (if/when detected) forcibly disable the affected IO APIC input's IRQ, disable the affected device/s (write 0x00000000 to the devices' Device Control register), unload/terminate any affected device's driver, mark the device/s as "potentially faulty" (in whatever the OS uses to track the state of each PCI device) and inform the user. To detect flooding, you'd probably need to track the amount of time between sending the EOI and receiving the next IRQ; and either increment a "back to back IRQs" counter (if the time was below a threshold) or zero the "back to back IRQs" counter (if the time was greater than the threshold). Then, if the "back to back IRQs" counter exceeds a max. value (e.g. 1000 "back to back IRQs" in a row) you've detected an IRQ flood.
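The flood detection described above might look something like this minimal sketch. The struct, the function names and the 2000 ns gap threshold are assumptions made for illustration; only the counter logic and the example limit of 1000 come from the post. The caller is assumed to supply timestamps from some monotonic clock.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOOD_GAP_THRESHOLD_NS 2000u  /* assumed: gaps shorter than this count as "back to back" */
#define FLOOD_MAX_BACK_TO_BACK 1000u  /* example limit from the post */

/* Per-IRQ-line tracking state (hypothetical layout). */
struct irq_flood_state {
    uint64_t last_eoi_ns;   /* time the previous EOI was sent    */
    uint32_t back_to_back;  /* consecutive IRQs that arrived too soon */
};

/* Call right after sending the EOI for this line. */
static inline void irq_flood_note_eoi(struct irq_flood_state *s, uint64_t now_ns)
{
    s->last_eoi_ns = now_ns;
}

/* Call on IRQ entry. Returns true once the counter exceeds the limit. */
static inline bool irq_flood_note_irq(struct irq_flood_state *s, uint64_t now_ns)
{
    if (now_ns - s->last_eoi_ns < FLOOD_GAP_THRESHOLD_NS)
        s->back_to_back++;      /* arrived too soon after the EOI */
    else
        s->back_to_back = 0;    /* a normal gap resets the streak */

    return s->back_to_back > FLOOD_MAX_BACK_TO_BACK;
}
```

When `irq_flood_note_irq` returns true, the handler would mask the IO APIC input and take the recovery steps listed above.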
Cheers,
Brendan
Re: PCI devices and irq flooding
Thank you very much, Brendan, for that detailed post. You gave me quite a few pointers and I'll make sure to follow them. I will try writing 0x00000000 to all device registers during enumeration (except the video device and bridges?), and make sure that my PCI interrupts are not installed as level triggered active high. I'll post a follow-up later, as I'm at work. I have masked the PIC and IO APICs and installed the spurious handlers, so hopefully that is not the problem.
Re: PCI devices and irq flooding
Yes, you were correct, Brendan: I had installed them as level triggered active high instead of level triggered active low. That caused the issue; my driver now works perfectly.
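For anyone hitting the same problem, here is a rough sketch of building an IO APIC redirection table entry for a PCI interrupt as level triggered, active low. The function and macro names are made up for illustration; the bit positions follow the standard IO APIC redirection entry layout. It assumes fixed delivery mode, physical destination mode, and the conventional PCI polarity; firmware tables (such as ACPI interrupt overrides) may call for something different on a given board.

```c
#include <stdint.h>

/* Bits in the 64-bit IO APIC redirection table entry. */
#define IOAPIC_REDIR_ACTIVE_LOW  (1ull << 13)  /* polarity: 0 = active high, 1 = active low */
#define IOAPIC_REDIR_LEVEL       (1ull << 15)  /* trigger:  0 = edge, 1 = level             */
#define IOAPIC_REDIR_MASKED      (1ull << 16)  /* 1 = interrupt masked                      */

/* Illustrative helper: entry for a PCI INTx input.
 * Delivery mode (bits 8-10) and destination mode (bit 11) are left at 0,
 * i.e. fixed delivery to a physical APIC ID. */
static inline uint64_t ioapic_pci_redir_entry(uint8_t vector,
                                              uint8_t dest_apic_id,
                                              int masked)
{
    uint64_t entry = vector;                             /* bits 0-7: vector  */
    entry |= IOAPIC_REDIR_ACTIVE_LOW | IOAPIC_REDIR_LEVEL;
    if (masked)
        entry |= IOAPIC_REDIR_MASKED;
    entry |= (uint64_t)dest_apic_id << 56;               /* bits 56-63: dest  */
    return entry;
}
```

Leaving bit 13 clear gives "active high", so an idle PCI line looks permanently asserted to the IO APIC, which matches the flood described in this thread.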