Just curious, what is the ICR set to on the first/'good' interrupt?
According to the 8254x manual, an interrupt is generated each time a bit in the ICR is set to 1b AND that interrupt is enabled in the IMS. All bits that are set at the time the ICR is read are cleared by that read. A race between software reading the ICR and hardware setting a bit is resolved by leaving the bit set (and I assume, although I haven't found explicit confirmation of this, that the bit won't appear in the value you read during the race but will instead cause another interrupt). This means that no interrupt cause bits can ever be lost: you either read-and-clear them, or possibly-don't-read-and-certainly-don't-clear them, but you can't clear-but-don't-read them.

In the code on your github the IMS is set to enable all interrupts in MODULE_INIT(), so there is the possibility of an interrupt being generated that you don't normally check for. Given that no bits can be missed, the number of interrupt-causing events should equal the number of set bits read (to be 100% correct, each bit is multiplied by how many times the corresponding event occurred, e.g. if you receive two packets before you read the ICR the bit would only be set once, but you would still find two packets in the receive buffer). So to receive two interrupts, either two bits must be set in the ICR, or one bit must be 'set twice' (the event occurred twice). I'm not sure whether it's possible to still receive a second interrupt whose cause you already acknowledged to the NIC (by reading the ICR) before the interrupt itself was delivered, but that is the only reasonable explanation I can come up with, other than the emulator being bugged. The event causing the second interrupt can't have happened after the first time you read the ICR, because bits are only cleared by software reads, and an interrupt-causing event will always set its bit if it is enabled, so zero bits set means zero events occurred since the last read. Another explanation could be that your interrupt handling code is wrong and this has nothing to do with the NIC itself, but that wouldn't explain why this doesn't happen with every interrupt, or at least with some interrupts from every (PCI) device.
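Just to be able to see this, it might be worth logging the raw ICR value and any bits you don't explicitly handle, given that your IMS enables everything. A minimal sketch, assuming the ICR sits at offset 0xC0 as in the manual's register map, with mmio_read32()/log() standing in for whatever your driver actually uses:

Code: Select all
#define REG_ICR        0x00C0     /* Interrupt Cause Read, offset per the 8254x register map */
#define HANDLED_CAUSES (1 << 7)   /* e.g. only RXT0 (receiver timer) is handled for now */

/* in the IRQ handler: */
uint32_t icr = mmio_read32(REG_ICR);   /* read-to-clear: acknowledges every bit that was set */
log("ICR = %08x", icr);                /* also answers the question above for the 'good' interrupt */
if(icr & ~HANDLED_CAUSES)
    log("unexpected ICR bits: %08x", icr & ~HANDLED_CAUSES);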
If I understand it correctly, the 'nonsense' interrupts only arrive immediately after/when a packet is received (indicated by bit 7 in the ICR), and the only things that are likely to generate an interrupt as an (indirect) result of receiving a packet are: a Small Receive Packet Detect (which as far as I can see you haven't enabled, in fact you haven't touched the RSRPD at all and these interrupts are disabled by default), reaching the Receive Descriptor Minimum Threshold (you've set it to 00b in the RCTL, meaning the interrupt is only generated each time the number of free receive descriptors drops to exactly half the total number of descriptors), or a Receiver FIFO Overrun, which means there were either no descriptors available or the PCI bus was too slow, so the packet was dropped (in which case bit 7 would not be set because no new packet was written to memory, so the first interrupt in this case wouldn't have been a 'good' one either). The first one is impossible unless the emulated hardware has a different default state (which would be incorrect), or something else has messed with the NIC. The other two, especially the last one, are very unlikely considering you mentioned in another thread that your OS works fine on other emulators and on real hardware.
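If you want to narrow it down further, you could log exactly which of those receive-related causes is set whenever an interrupt arrives. The bit positions below are what I get from the ICR description in the manual (worth double-checking), and log() is again just a placeholder:

Code: Select all
#define ICR_RXDMT0 (1 << 4)    /* Receive Descriptor Minimum Threshold reached */
#define ICR_RXO    (1 << 6)    /* Receiver FIFO Overrun (packet dropped) */
#define ICR_RXT0   (1 << 7)    /* Receiver Timer Interrupt: new packet written to memory */
#define ICR_SRPD   (1 << 16)   /* Small Receive Packet Detected */

void dump_rx_causes(uint32_t icr)
{
    if(icr & ICR_SRPD)   log("SRPD");    /* should never happen: disabled by default, RSRPD untouched */
    if(icr & ICR_RXDMT0) log("RXDMT0");  /* free descriptors dropped to half the ring */
    if(icr & ICR_RXO)    log("RXO");     /* overrun: RXT0 would not be set for the dropped packet */
    if(icr & ICR_RXT0)   log("RXT0");    /* the normal 'good' cause */
}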
Also note that bit 7 in ICR is only set and the interrupt only generated each time a new packet is stored in memory, so the sequence: "IRQ -> read ICR -> packet still not handled -> IRQ -> packet was handled and ICR was quickly set to zero" is not possible.
8254x Family of Gigabit Ethernet Controllers Software Developer’s Manual, 13.4.18, ITR wrote:Software can use this register to pace (or even out) the delivery of interrupts to the host CPU. This register provides a guaranteed inter-interrupt delay between interrupts asserted by the Ethernet controller, regardless of network traffic conditions.
If this feature is supported by VirtualBox, you could set the delay to the highest value possible to determine whether the second interrupt actually came from the NIC (in which case there would be a significant delay of about 1/60 seconds) or from something else (in which case there would be no delay at all). Additionally you could use the ITR to limit the interrupt rate if you are really worried about IRQ floods or IRQs arriving before the previous one has been handled. Also note that this register makes loops like the one quoted below unnecessary/inefficient:
Brendan wrote:If you sent EOI at the right time (after your thread has told the kernel it has finished handling the cause of the IRQ) it'd be impossible for a second/unnecessary IRQ to occur before your thread starts.
Note: For gigabit Ethernet (where something else might cause an IRQ while it's handling an IRQ) I'd be tempted to do something like:
Code: Select all
do {
    wait_or_whatever();          // block this thread (until something unblocks this thread)
    if(IRQ_occured) {            // if the thread was unblocked because an IRQ occured
        do {
            handle_cause(lastICR);
            lastICR = ICR;
        } while(lastICR != 0);
        tell_kernel_finished(status); // Tell kernel I finished handling the cause of the IRQ
    } else {
        // Handle other things that could've unblocked the thread
    }
} while(running);                // Go back to waiting (unless driver was terminated somewhere)
Instead of spending valuable CPU time on making sure you handle fast-arriving interrupts in the minimum number of IRQs/EOIs, you could just use the ITR to make sure interrupts don't arrive at such a fast rate, but are combined instead. On top of that, doing all this work before sending an EOI throws away the main advantage of the 8254x interrupt mechanism. The whole thing is designed such that the minimum required work (in terms of communicating with the device) to handle an IRQ is as low as possible: a single read from the ICR, that's it. They specifically made the ICR clear-on-read and made it contain all interrupt conditions so that you wouldn't have to do any more MMIO reads or writes within an IRQ handler / before sending EOI. All other work could (but doesn't have to) be done on a separate thread, allowing you to control the priority of that thread without fear of not sending EOIs in time.
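A rough sketch of what I mean, with the kernel-side primitives (wake_nic_thread(), wait_for_wakeup(), send_eoi(), handle_causes(), running) as placeholders for whatever your OS provides, and REG_ICR/mmio_read32() as above:

Code: Select all
#include <stdint.h>
#include <stdatomic.h>

static _Atomic uint32_t pending_icr = 0;

/* IRQ context: the only device access is the single read-to-clear of the ICR */
void nic_irq_handler(void)
{
    atomic_fetch_or(&pending_icr, mmio_read32(REG_ICR)); /* accumulate causes for the thread */
    wake_nic_thread();  /* e.g. release a semaphore the driver thread waits on */
    send_eoi();         /* safe already: the causes were acknowledged by the read */
}

/* thread context: all descriptor/buffer handling happens here, at whatever priority you like */
void nic_thread(void)
{
    while(running) {
        wait_for_wakeup();
        uint32_t icr = atomic_exchange(&pending_icr, 0);
        if(icr != 0)
            handle_causes(icr);
    }
}

Whether you accumulate causes like this or hand over each raw value separately is up to you; the point is that everything touching descriptors and buffers runs after the EOI, outside the IRQ handler.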
EDIT:
sleephacker wrote:The device has a maximum interrupt rate, making two interrupts immediately after each other very unlikely:
8254x Family of Gigabit Ethernet Controllers Software Developer’s Manual, 13.4.18, ITR wrote:Software can use this register to pace (or even out) the delivery of interrupts to the host CPU. This register provides a guaranteed inter-interrupt delay between interrupts asserted by the Ethernet controller, regardless of network traffic conditions. [...] The maximum observable interrupt rate from the Ethernet controller must never exceed 7813 interrupts/sec.
I don't know if that last sentence is taken into consideration by the emulator, but at least on real hardware the NIC would never send an interrupt immediately after another, and given the many millions of instructions CPUs can execute per second, you would probably be way past your IRET before the next interrupt is sent.
Actually, I misinterpreted that: there is no maximum interrupt rate if you haven't enabled the ITR, so the NIC could send one interrupt immediately after another. The rate of 7813 ints/sec was from an example setting, but because they didn't say "in this case / with this setting, the interrupt rate must never exceed...", I thought it applied at all times.
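For reference, the interval field of the ITR is in 256 ns increments (assuming I'm reading the register description correctly), so the two numbers mentioned above work out as follows; the 7813 figure presumably corresponds to an example interval of 500, i.e. 128 µs:

Code: Select all
#include <math.h>
#include <stdint.h>

/* maximum interrupt rate for a given ITR interval (in 256 ns units) */
double max_irq_rate(uint16_t interval)
{
    if(interval == 0)
        return INFINITY;               /* 0 = throttling disabled, back-to-back interrupts possible */
    return 1.0 / (interval * 256e-9);
}
/* max_irq_rate(500)    ~= 7812.5 ints/sec  -> the manual's 7813 figure                  */
/* max_irq_rate(0xFFFF) ~= 59.6 ints/sec    -> ~16.8 ms between interrupts, the ~1/60 s above */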