Intel PRO/1000 spams interrupts in VirtualBox

mariuszp
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: Intel PRO/1000 spams interrupts in VirtualBox

Post by mariuszp »

(just to clarify, those addresses are mapped to the I/O APIC register space)
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
mariuszp wrote:I've done as you said but there is a problem. I have a function called irqMask() which masks interrupts, and irqUnmask() which unmasks them:
mariuszp wrote:Now when I test it on his VirtualBox installation - ONLY THAT ONE - the OS is once again stuck, handling interrupts indefinitely as they arrive (it seems). If you send an NMI, it is ALWAYS on some instruction within irqMask() or irqUnmask(), which comes right after reading from (*iowin). Is it a problem with those 2 functions (as shown above) or could it still be something else?
I'm a bit confused.

If you did what I suggested, you'd only be calling "irqUnmask()" once when initialising a driver and only calling "irqMask()" once when terminating a driver. In that case it'd be possible (if something is wrong elsewhere) to get "IRQ flood" constantly interrupting an instruction within "irqUnmask()" when initialising a driver, but not possible/likely to be constantly interrupting an instruction within "irqMask()" because the "IRQ flood" would cause problems before you terminate the driver.

In any case; I'd be tempted to suspect that you're messing with the wrong IO APIC entry in the first place. For example, rather than doing the "correct but hard" thing (which requires the use of ACPI AML to figure out which IO APIC input the PCI device actually uses) maybe you've done some kind of "easy but broken" thing that only works on some computers by accident.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by mariuszp »

I mask the interrupt, send EOI, and then unmask it once the driver plays with the appropriate register.

I derive the PCI IRQ routing via ACPICA, I don't just look at the MADT.

What's happening is that the system crashes at one point (after initializing the keyboard, for some reason; but that's possibly a coincidence), and when interrupted with an NMI, I *always* catch it inside the irqMask() or irqUnmask() function, on the instruction that follows a read from (*iowin). The interrupts are disabled at this point (as confirmed by the pushed RFLAGS value); however, it is definitely not *stuck* there, because each time it is interrupted, the stack trace below irqMask()/irqUnmask() is completely different (it's running in different threads).

EDIT: Furthermore, I only seem to see 2 stacks: the keyboard driver's interrupt handling thread, and the Intel PRO/1000 driver's interrupt handling thread.

Post by Brendan »

Hi,
mariuszp wrote:I mask the interrupt, send EOI, and then unmask it once the driver plays with the appropriate register.
I suspected...

The "mask then EOI early, then unmask at the end" scheme is an ancient hack used by some kernels to bypass the IRQ priority scheme in PIC chips, because the IRQ priority scheme in PIC chips is hard-wired and relatively silly (e.g. high performance devices like NICs end up with very low priority IRQs, while low performance devices like the PS/2 keyboard end up with a very high priority IRQ).

For APICs the kernel controls the IRQ priority scheme directly (kernel decides what the IRQ's priority should be and selects the "interrupt vector" to reflect that), so there's very little reason to want "mask then EOI early, then unmask at the end".

Also note that for micro-kernels with drivers in user-space; the IRQ priority scheme is mostly irrelevant. The IRQ priority scheme determines the order in which IRQs are received by the kernel, which determines the order in which the kernel sends "IRQ occurred notifications" to threads in user-space (which usually causes the thread/s to be unblocked); but it's the scheduler that decides which thread in user-space will run when you switch from kernel back to user-space, not the IRQ priority scheme. For this reason, doing extra work (diddling with masking/unmasking IRQs) isn't going to make any difference or be worthwhile.
mariuszp wrote:I derive the PCI IRQ routing via ACPICA, I don't just look at the MADT.
How can you figure out how many IO APICs there are (and the physical address of each one) without looking at the MADT?
mariuszp wrote:What's happening is that the system crashes at one point (after initializing the keyboard, for some reason; but that's possibly a coincidence), and when interrupted with an NMI, I *always* catch it inside the irqMask() or irqUnmask() function, on the instruction that follows a read from (*iowin). The interrupts are disabled at this point (as confirmed by the pushed RFLAGS value); however, it is definitely not *stuck* there, because each time it is interrupted, the stack trace below irqMask()/irqUnmask() is completely different (it's running in different threads).

EDIT: Furthermore, I only seem to see 2 stacks: the keyboard driver's interrupt handling thread, and the Intel PRO/1000 driver's interrupt handling thread.
This is the same "IRQ flood" problem that Korona (correctly) diagnosed and described days ago.

Essentially; a device tells the IO APIC that it wants an IRQ, the IO APIC tells the CPU, the IRQ handler doesn't service the device but does EOI (with or without pointless masking/unmasking in there somewhere); and after this the device is still telling the IO APIC it wants an IRQ (because the driver didn't service the device properly) so the IO APIC tells the CPU again and.... this results in constant IRQs that never stop. You check the instruction at EIP (at any point in time) and it has to be something that's executed during IRQ handling because that's all the CPU is (repeatedly) executing.

There are only 3 likely causes of "IRQ flood":
  • You have no driver for the device causing the IRQ (and failed to "logically disconnect" the device from the PCI bus and/or failed to leave that IRQ masked)
  • You do have a driver for the device causing the IRQ and it fails to service the device properly (e.g. maybe the device is trying to tell you something like "my buffer is full" and your device driver is not doing anything to empty the buffer).
  • You messed up the IRQ routing (e.g. wrong device driver notified when IRQ occurs)
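
The second cause can be illustrated with a small user-space model. This is only a sketch: "struct device", "handler_eoi_only()", "handler_service()" and "deliveries()" are hypothetical names invented for the illustration, and the loop is a simplified stand-in for what a level-triggered IO APIC input does after each EOI.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one level-triggered device; none of these names
 * come from the thread or from any real driver. */
struct device {
    bool needs_service;   /* the device keeps its IRQ line asserted while true */
};

typedef void (*irq_handler)(struct device *dev);

/* Broken handler: the interrupt is acknowledged (EOI) but the device is
 * never touched, so the cause of the interrupt stays in place. */
void handler_eoi_only(struct device *dev)
{
    (void)dev;
}

/* Correct handler: removes the cause (e.g. reads a status register or
 * drains a buffer) before the EOI is sent. */
void handler_service(struct device *dev)
{
    dev->needs_service = false;
}

/* Count how often the handler runs for a single device event.
 * "limit" stands in for "forever" so that the simulation terminates. */
unsigned deliveries(irq_handler handler, unsigned limit)
{
    struct device dev = { .needs_service = true };
    unsigned count = 0;

    assert(limit > 0);
    /* A level-triggered input is delivered again after every EOI for as
     * long as the line is still asserted. */
    while (dev.needs_service && count < limit) {
        handler(&dev);
        count++;          /* EOI sent; the line is sampled again */
    }
    return count;
}
```

With the servicing handler the event is delivered once; with the EOI-only handler it is delivered until the artificial limit is reached, which is the flood.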

Cheers,

Brendan
Korona
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm

Post by Korona »

Brendan wrote:The "mask then EOI early, then unmask at the end" scheme is an ancient hack used by some kernels to bypass the IRQ priority scheme in PIC chips, because the IRQ priority scheme in PIC chips is hard-wired and relatively silly (e.g. high performance devices like NICs end up with very low priority IRQs, while low performance devices like the PS/2 keyboard end up with a very high priority IRQ).
I would not call it a hack. It is true that you don't necessarily need the mask-then-EOI scheme if you (a) use the I/O APIC and (b) ensure that EOIs are sent in the correct order. For microkernels the latter requires a priority-aware scheduler, and the priorities of driver threads have to exactly match the IRQ priorities (which is probably what you should be doing anyway). Note that it might still make sense to mask-then-EOI because doing that does not block a whole priority level but only an IRQ line (or a single device if you e.g. mask in the PCI control register). More importantly, it enables the scheduler to make better decisions: if there are multiple IRQs of the same priority then all their handlers are woken up simultaneously and the scheduler gets to decide which ones to run first. This is likely better than the random decision by the APIC.

And for the XT-PIC the scheme is always necessary, because its priorities are stupid (as you said).

Regarding the IRQ flood: Do you invoke the \_PIC AML control method to tell the firmware that you're using the APIC model, before reading _PRT? Interrupt routing might differ between the XT-PIC and the I/O APIC. For example some ICHs route PCI interrupts to IRQs 16-23 in APIC mode.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].

Post by mariuszp »

Sorry, I meant that I look at both the MADT and the IRQ routing from ACPICA.

I do not invoke the \_PIC method directly, and I'm not sure whether ACPICA does it; I will have to check that.
But now it's not just PCI interrupts... The keyboard driver seems to get stuck servicing IRQ1 too.
I mask an interrupt when it arrives, and I unmask it only once the interrupt has been serviced by a driver. In the PRO/1000 case, this is after I read the ICR and it is nonzero. For the keyboard driver, it is immediately when IRQ1 arrives. So if the PRO/1000 interrupt gets unblocked, it must be because something in the ICR has been set...

Post by mariuszp »

Now it floods IRQ1... but nothing in it changed, except that it is now masked before EOI, and unmasked when the keyboard handler is called..

Post by Brendan »

Hi,
mariuszp wrote:Now it floods IRQ1... but nothing in it changed, except that it is now masked before EOI, and unmasked when the keyboard handler is called..
That's impossible; unless you've mis-configured the corresponding IO APIC input (e.g. used "level triggered" when the MADT says it should be "edge triggered").

Note: I tried to figure out what's going on (using the link to your source in your signature) and got lost. It's hard to navigate unfamiliar projects (especially when they're "larger"), and significantly worse when it's all jumbled up - e.g. "apic.c" containing almost nothing (with no indication if it's "local APIC" or "IO APIC" or both); and "idt.c" containing an eclectic mixture of stuff that has nothing to do with the IDT (including random scraps of IO APIC code). I failed to find any code to initialise the IO APIC at all.


Cheers,

Brendan

Post by mariuszp »

Brendan wrote:Hi,
mariuszp wrote:Now it floods IRQ1... but nothing in it changed, except that it is now masked before EOI, and unmasked when the keyboard handler is called..
That's impossible; unless you've mis-configured the corresponding IO APIC input (e.g. used "level triggered" when the MADT says it should be "edge triggered").

Note: I tried to figure out what's going on (using the link to your source in your signature) and got lost. It's hard to navigate unfamiliar projects (especially when they're "larger"), and significantly worse when it's all jumbled up - e.g. "apic.c" containing almost nothing (with no indication if it's "local APIC" or "IO APIC" or both); and "idt.c" containing an eclectic mixture of stuff that has nothing to do with the IDT (including random scraps of IO APIC code). I failed to find any code to initialise the IO APIC at all.


Cheers,

Brendan
https://github.com/madd-games/glidix/bl ... acpi.c#L88

And irqMask() and irqUnmask() are in idt.c.

The flood begins at (modules/ps2kbd.c):

Code:

outb(0x64, 0x20);
Note that if irqMask() and irqUnmask() are NEVER called, the keyboard IRQ flood does not happen. Is something wrong with them?

Post by mariuszp »

If I only call irqUnmask() once the keyboard and mouse are fully initialized (for IRQ1 and IRQ12), it boots further but still gets stuck later. And EVERY TIME I send an NMI, it is ALWAYS stuck in irqMask() or irqUnmask(), and ALWAYS on an instruction following a read/write.

Post by mariuszp »

I changed the code so that it only masks level-triggered interrupts and never edge-triggered. Now it correctly receives all interrupts.

Post by Brendan »

Hi,
mariuszp wrote:I changed the code so that it only masks level-triggered interrupts and never edge-triggered. Now it correctly receives all interrupts.
The code to parse "Interrupt Source Override Structures" (in ACPI's MADT) overlooks the fact that for both the polarity flags and the trigger mode flags "00b = conforms to the specifications of the bus". In this case you're supposed to use the "source bus type" to determine what the polarity and trigger mode actually are. For example, something like:

Code:

    if (polarity_flags == 0) {              // 00b = conforms to the bus
        switch (source_bus_type) {
            case ISA:
                polarity_flags = ACTIVE_HIGH;
                break;
            case PCI:
                polarity_flags = ACTIVE_LOW;
                break;
            default:
                // PANIC! Unknown or unsupported bus (EISA, MCA?)
                break;
        }
    }
    if (trigger_mode_flags == 0) {          // 00b = conforms to the bus
        switch (source_bus_type) {
            case ISA:
                trigger_mode_flags = EDGE;
                break;
            case PCI:
                trigger_mode_flags = LEVEL;
                break;
            default:
                // PANIC! Unknown or unsupported bus (EISA, MCA?)
                break;
        }
    }
Also; I think you're doing "if ((irqFlags & 0x30) == 0x30)" to determine trigger mode (at line 112 in "acpi.c") and that this is wrong and should be "if ((irqFlags & 0x0C) == 0x0C)". If I'm right; it'd mean all level triggered IRQs are being treated as edge triggered.

I'd also recommend using your own flags, where a single bit is used for "level or edge triggered" and a single bit is used for "active high/low" (so that you don't need to care about the hideously ugly/silly "conforms to bus" and "reserved" mess that ACPI's flags have).
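
A minimal sketch of what such private flags might look like, assuming the MADT layout where polarity is in bits 0-1 and trigger mode is in bits 2-3 (01b = active high / edge, 11b = active low / level, 10b = reserved). The names "IRQ_LEVEL_TRIGGERED", "IRQ_ACTIVE_LOW", "enum bus_type" and "decode_madt_flags()" are made up for this example and are not taken from the glidix source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-internal representation: one bit each, with no
 * "conforms to bus" or "reserved" states left over. */
#define IRQ_LEVEL_TRIGGERED  (1u << 0)   /* clear = edge triggered */
#define IRQ_ACTIVE_LOW       (1u << 1)   /* clear = active high    */

enum bus_type { BUS_ISA, BUS_PCI };

/* Convert the 16-bit MADT flags into the internal form. Returns false
 * for the reserved encoding 10b, which the caller should treat as a
 * firmware error. */
bool decode_madt_flags(uint16_t madt_flags, enum bus_type bus, unsigned *out)
{
    unsigned polarity = madt_flags & 0x03;          /* bits 0-1 */
    unsigned trigger  = (madt_flags >> 2) & 0x03;   /* bits 2-3 */
    unsigned result   = 0;

    assert(out != NULL);

    switch (polarity) {
    case 0:                         /* conforms to bus: ISA high, PCI low */
        if (bus == BUS_PCI)
            result |= IRQ_ACTIVE_LOW;
        break;
    case 1:                         /* active high */
        break;
    case 3:                         /* active low */
        result |= IRQ_ACTIVE_LOW;
        break;
    default:                        /* 10b is reserved */
        return false;
    }

    switch (trigger) {
    case 0:                         /* conforms to bus: ISA edge, PCI level */
        if (bus == BUS_PCI)
            result |= IRQ_LEVEL_TRIGGERED;
        break;
    case 1:                         /* edge triggered */
        break;
    case 3:                         /* level triggered */
        result |= IRQ_LEVEL_TRIGGERED;
        break;
    default:                        /* 10b is reserved */
        return false;
    }

    *out = result;
    return true;
}
```

After this conversion the rest of the kernel only ever tests two bits, e.g. "if (flags & IRQ_LEVEL_TRIGGERED)", and the ACPI encoding stays confined to the MADT parser.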

Don't forget that accessing the IO APIC is significantly slower than accessing normal RAM. It's "uncached", and typically it's in the "PCI to LPC" bridge with all the other slow legacy devices where your accesses have to go across slow PCI buses. For every IRQ you are doing 24 writes to the "regsel", 24 reads from "iowin" and one write to "iowin" to mask the IRQ, and then the same again later to unmask the IRQ. This adds up to a total of 98 significantly slow IO APIC register accesses for every IRQ. I still think "mask then EOI early, then unmask" is stupid (e.g. including breaking "send to lowest priority" and ruining hardware's ability to automatically balance IRQs across CPUs); but if you're going to do it then it can be done far more efficiently, with a total of 4 significantly slow IO APIC register accesses (that could've been avoided) for every IRQ instead of 98 of them.
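
One possible way to get down to 4 accesses (2 to mask, 2 to unmask) is to keep a copy of each redirection entry's low dword in normal RAM, so that masking never has to search or read the IO APIC. The sketch below is an assumption about how that could be done, not the code from the thread: the IO APIC is simulated with an array so that it runs in user space, and "regsel_write()", "iowin_write()", "redtbl_lo_cache" and "irq_set_mask()" are hypothetical names. The register numbering (low dword of entry n at index 0x10 + 2n, mask in bit 16) follows the usual IO APIC layout. Locking for multi-CPU is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_REDTBL_LO(pin)  (0x10u + 2u * (pin))  /* low dword of entry */
#define IOAPIC_MASK_BIT        (1u << 16)
#define IOAPIC_PINS            24u

/* Simulated hardware so the sketch runs in user space; a kernel would
 * use two volatile pointers into the uncached IO APIC mapping instead. */
uint32_t sim_regs[0x40];
uint32_t sim_regsel;
unsigned slow_accesses;                  /* counts every simulated MMIO access */

static void regsel_write(uint32_t index)
{
    sim_regsel = index;
    slow_accesses++;
}

static void iowin_write(uint32_t value)
{
    sim_regs[sim_regsel] = value;
    slow_accesses++;
}

/* Copy of each entry's low dword kept in normal RAM. It is assumed to be
 * filled in when the entry is first programmed during IO APIC setup. */
uint32_t redtbl_lo_cache[IOAPIC_PINS];

/* Mask or unmask one IO APIC input using exactly two slow accesses. */
void irq_set_mask(unsigned pin, int masked)
{
    assert(pin < IOAPIC_PINS);

    if (masked)
        redtbl_lo_cache[pin] |= IOAPIC_MASK_BIT;
    else
        redtbl_lo_cache[pin] &= ~IOAPIC_MASK_BIT;

    regsel_write(IOAPIC_REDTBL_LO(pin));   /* access 1: select the register */
    iowin_write(redtbl_lo_cache[pin]);     /* access 2: write the new value */
}
```

The pin number has to be known up front (e.g. stored next to the handler when the IRQ is registered), which is what removes the 24-entry search.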


Cheers,

Brendan

Post by Korona »

Masking edge-triggered IRQs is problematic because you might lose an IRQ if the edge-triggered IRQ is raised while it is masked. This cannot happen for level-triggered IRQs.
Brendan wrote:Don't forget that accessing the IO APIC is significantly slower than accessing normal RAM. It's "uncached", and typically it's in the "PCI to LPC" bridge with all the other slow legacy devices where your accesses have to go across slow PCI buses. For every IRQ you are doing 24 writes to the "regsel", 24 reads from "iowin" and one write to "iowin" to mask the IRQ, and then the same again later to unmask the IRQ. This adds up to a total of 98 significantly slow IO APIC register accesses for every IRQ. I still think "mask then EOI early, then unmask" is stupid (e.g. including breaking "send to lowest priority" and ruining hardware's ability to automatically balance IRQs across CPUs); but if you're going to do it then it can be done far more efficiently, with a total of 4 significantly slow IO APIC register accesses (that could've been avoided) for every IRQ instead of 98 of them.
Agreed. The __sync_synchronize() calls are unnecessary too. __sync_synchronize() translates to mfence. However, volatile guarantees that your compiler does not reorder accesses to the register and the "uncached" MTRR guarantees that the CPU does not reorder/cache accesses.

Post by Brendan »

Hi,
Korona wrote:Agreed. The __sync_synchronize() calls are unnecessary too. __sync_synchronize() translates to mfence. However, volatile guarantees that your compiler does not reorder accesses to the register and the "uncached" MTRR guarantees that the CPU does not reorder/cache accesses.
Yes.

I'd also recommend having a nice little "write_IOAPIC_register()" function and a "read_IOAPIC_register()" function that acquire a lock, then set "regsel" and read or write "iowin", then release the lock. Without this (for multi-CPU) there's a risk that one CPU will set "regsel", then another CPU will set "regsel", then the first CPU will read or write to the wrong register.

Of course then you have to worry about using locks from within IRQ handlers (ensuring IRQs are disabled to avoid deadlocks, and spinning before you can acquire the lock, and lock contention).
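
A rough sketch of the two accessors, under several assumptions: the lock is a plain C11 "atomic_flag" spinlock, the IO APIC is simulated with an array so the code runs in user space, and "irq_save_disable()" / "irq_restore()" are empty placeholders for whatever the kernel uses to save RFLAGS and disable interrupts. Only the two function names come from the post above; everything else is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define IOAPIC_REG_COUNT 0x40u

/* Simulated register file; a kernel would use volatile pointers to the
 * real "regsel" and "iowin" locations instead. */
static uint32_t sim_regs[IOAPIC_REG_COUNT];
static volatile uint32_t sim_regsel;

static atomic_flag ioapic_lock = ATOMIC_FLAG_INIT;

/* Placeholders: a kernel would save RFLAGS and execute CLI in the first
 * one, and restore the saved RFLAGS in the second. */
static unsigned long irq_save_disable(void) { return 0; }
static void irq_restore(unsigned long flags) { (void)flags; }

static void ioapic_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&ioapic_lock, memory_order_acquire))
        ;   /* spin until the other CPU releases the lock */
}

static void ioapic_lock_release(void)
{
    atomic_flag_clear_explicit(&ioapic_lock, memory_order_release);
}

uint32_t read_IOAPIC_register(uint32_t index)
{
    assert(index < IOAPIC_REG_COUNT);

    /* IRQs go off before the lock is taken, so an IRQ handler on this CPU
     * can never spin on a lock that this CPU already holds. */
    unsigned long flags = irq_save_disable();
    ioapic_lock_acquire();
    sim_regsel = index;                      /* select the register  */
    uint32_t value = sim_regs[sim_regsel];   /* then read the window */
    ioapic_lock_release();
    irq_restore(flags);
    return value;
}

void write_IOAPIC_register(uint32_t index, uint32_t value)
{
    assert(index < IOAPIC_REG_COUNT);

    unsigned long flags = irq_save_disable();
    ioapic_lock_acquire();
    sim_regsel = index;                      /* select the register   */
    sim_regs[sim_regsel] = value;            /* then write the window */
    ioapic_lock_release();
    irq_restore(flags);
}
```

Because the select and the access happen under the same lock, a second CPU cannot change "regsel" between the two steps.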


Cheers,

Brendan