PCIe NVMe MSI-X interrupts not being delivered by QEMU

mleks
Posts: 8
Joined: Fri May 31, 2024 2:05 am


Post by mleks »

I'm having a problem with MSI-X interrupts for a PCIe NVMe controller in my early-stage kernel on QEMU (8.2.2 and custom-compiled 9.1 with added tracing). Although I've enabled MSI-X and set up the MSI-X table correctly, and the NVMe controller is raising the right MSI-X IRQ vector, QEMU isn't delivering the interrupt to the LAPIC, or the Local APIC isn't handling it.
The interrupt handler for IRQ#0x33 should fire on an I/O Command Set READ command. It never fires, and the dptr and metadata pointers show 0xaa for data and 0x0 for metadata, which is strange.

Key points:
1. MSI-X is enabled in the NVMe controller's PCI configuration space.
2. The MSI-X table is properly set up, with entry 1 configured for vector 0x33.
3. QEMU logs show many "pci_nvme_irq_msix raising MSI-X IRQ vector 0" messages, and some "IRQ is masked" messages (I assume these are related to pin-based interrupts).

Code:

pci_nvme_irq_masked IRQ is masked
pci_nvme_irq_masked IRQ is masked
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_irq_msix raising MSI-X IRQ vector 1
4. QEMU isn't calling apic_deliver_irq after raising the MSI-X IRQ vector #1. It only delivers:

Code:

 apic_deliver_irq dest 0 dest_mode 0 delivery_mode 0 vector 0 trigger_mode 0
 

5. My own tracing shows the correct MSI-X table address and IRQ vector in the pci_msi_trigger function in QEMU's hw/pci/pci.c.

Code:

  
pci_nvme_irq_msix raising MSI-X IRQ vector 1
msix_mlk_01 MSI-X: MLK 01 dev.name=nvme
msix_mlk_02 MSI-X: MLK 02 dev.name=nvme
msix_mlk_03 MSI-X: MLK 03 dev.name=nvme
msix_mlk_04 MSI-X: MLK 04 dev.name=nvme
msix_mlk_05 MSI-X: MLK 05 dev.name=nvme
msix_mlk_10 MSI-X: MLK 10 dev.name=nvme, addr: 0xffff8000fee00000, data: 0x33
6. I'm using SMP=1, cpuid=0, lapic_id=0.
7. LAPIC timer works, and manually triggering IRQ#0x33 works.

The message data seems to be sent to "nowhere" (0xffff8000fee00000). I've checked the page-frame mappings and they look OK, but the LAPIC address I'm using might still be wrong, even though it seems to be OK.

I found a post at https://forum.osdev.org/viewtopic.php?t=37366 where the author manually sets a bit in the Pending Bit Array (PBA) during MSI-X initialization.
I'm not sure whether this is necessary or whether it could affect interrupt delivery.

I'd appreciate any suggestions on how to fix this issue or areas to investigate. If anyone has dealt with a similar problem or has experience with PCIe NVMe MSI-X interrupts on QEMU, your help would be very valuable.

Thank you for your help!
Octocontrabass
Member
Posts: 5494
Joined: Mon Mar 25, 2013 7:01 pm

Re: PCIe NVMe MSI-X interrupts not being delivered by QEMU

Post by Octocontrabass »

mleks wrote: Tue Sep 17, 2024 4:43 am
The message data seems to be sent to "nowhere" (0xffff8000fee00000). I've checked the page-frame mappings and they look OK, but the LAPIC address I'm using might still be wrong, even though it seems to be OK.
It is getting sent to nowhere. The CPU's page tables only apply to the CPU's view of memory.
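
To make that concrete: the device writes the message to a physical (bus) address, so the value has to be built from the physical MSI window at 0xFEE00000, not from wherever the kernel has mapped the LAPIC registers. A minimal sketch in C, assuming fixed delivery to a single LAPIC in physical destination mode; the function and macro names are made up for illustration and are not from the original poster's kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Physical base of the x86 MSI address window. This is a physical address;
 * a kernel-virtual alias such as 0xffff8000fee00000 means nothing to a device. */
#define MSI_ADDR_BASE 0xFEE00000u

/* Message address for one target LAPIC (hypothetical helper).
 * Bits 19:12 carry the destination APIC ID; the redirection-hint and
 * destination-mode bits (3:2) are left at 0 here. */
static inline uint64_t msi_message_address(uint8_t dest_apic_id)
{
    return (uint64_t)(MSI_ADDR_BASE | ((uint32_t)dest_apic_id << 12));
}

/* Message data (hypothetical helper): vector in bits 7:0, delivery mode 000
 * (fixed), edge-triggered, so the value is just the vector number. */
static inline uint32_t msi_message_data(uint8_t vector)
{
    assert(vector >= 0x10); /* vectors 0-15 are not valid for APIC delivery */
    return vector;
}
```

With lapic_id=0 and vector 0x33 as in the first post, this gives address 0xFEE00000 and data 0x33.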
mleks
Posts: 8
Joined: Fri May 31, 2024 2:05 am

Re: PCIe NVMe MSI-X interrupts not being delivered by QEMU

Post by mleks »

Octocontrabass wrote: Tue Sep 17, 2024 10:11 am
mleks wrote: Tue Sep 17, 2024 4:43 am
The message data seems to be sent to "nowhere" (0xffff8000fee00000). I've checked the page-frame mappings and they look OK, but the LAPIC address I'm using might still be wrong, even though it seems to be OK.
It is getting sent to nowhere. The CPU's page tables only apply to the CPU's view of memory.
This simple suggestion opened my eyes. Thank you. I was using a virtual address instead of a physical address in the MSI-X Message Address field.
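
For anyone who finds this thread later, here is a rough sketch of what the corrected setup might look like. It is not my actual kernel code: the struct and function names are placeholders, and it assumes the MSI-X table BAR is already mapped so the CPU can reach it. The point is that the table pointer is a virtual address (the CPU dereferences it), while the value written into the Message Address field must be physical (the device uses it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 16-byte MSI-X table entry as it sits in the device's BAR. */
struct msix_entry {
    uint32_t msg_addr_lo;
    uint32_t msg_addr_hi;
    uint32_t msg_data;
    uint32_t vector_ctrl;   /* bit 0 = per-vector mask */
};

#define MSIX_VECTOR_MASKED 0x1u

/* Hypothetical helper: program one table entry.
 * table         - CPU-virtual pointer to the mapped MSI-X table
 * msg_addr_phys - PHYSICAL message address (0xFEE00000 for LAPIC ID 0) */
static void msix_program_entry(volatile struct msix_entry *table, size_t index,
                               uint64_t msg_addr_phys, uint32_t msg_data)
{
    assert(table != NULL);
    assert((msg_addr_phys & 0x3u) == 0);   /* must be DWORD aligned */

    volatile struct msix_entry *e = &table[index];

    /* Mask the vector while its address and data are being changed. */
    e->vector_ctrl = e->vector_ctrl | MSIX_VECTOR_MASKED;
    e->msg_addr_lo = (uint32_t)msg_addr_phys;
    e->msg_addr_hi = (uint32_t)(msg_addr_phys >> 32);
    e->msg_data    = msg_data;
    /* Unmask so the device may deliver the interrupt. */
    e->vector_ctrl = e->vector_ctrl & ~MSIX_VECTOR_MASKED;
}
```

In my case that would be something like msix_program_entry(table, 1, 0xFEE00000, 0x33) instead of passing 0xffff8000fee00000.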