Re: NVMe confusion
Posted: Fri Dec 18, 2020 7:56 pm
by Ethin
Octocontrabass wrote:Does memory exist at those addresses? The NVMe controller doesn't provide memory to store the queues.
I know. And I believe memory does exist there, but I'd need to really mess around with my memory allocation routines to actually validate that, as I don't currently do any validation.
Octocontrabass wrote:It's not really on topic, but I'm curious what your other plans are without paging.
What do you mean? I am using paging. I'm just not allocating virtual addresses for hardware-specific memory, because I know the hardware wouldn't be able to use those addresses.
Re: NVMe confusion
Posted: Fri Dec 18, 2020 8:30 pm
by Octocontrabass
Ethin wrote:And I believe memory does exist there, but I'd need to really mess around with my memory allocation routines to actually validate that, as I don't currently do any validation.
I've never seen a PC with memory at such high addresses! What does the firmware-provided memory map say?
Ethin wrote:I am using paging. I'm just not allocating virtual addresses for hardware-specific memory, because I know the hardware wouldn't be able to use those addresses.
You're right that hardware can't use virtual addresses, but you can easily walk the page tables to translate a virtual address into a physical address before giving it to the hardware. Identity mapping sounds like it will be more work, since you'll need to resolve conflicts where a physical address is available but a virtual address is not (or the other way around).
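For what it's worth, the first step of such a walk is just slicing the virtual address into its four table indices plus the page offset. A minimal sketch (illustrative names; assumes 4 KiB pages and no large-page handling):

```rust
// Extract the four x86_64 page-table indices from a virtual address -- the
// first step of a manual walk to recover the physical address before handing
// it to a device. Names are illustrative, not from any particular kernel.
fn table_indices(vaddr: u64) -> [usize; 4] {
    [
        ((vaddr >> 39) & 0x1FF) as usize, // PML4 index
        ((vaddr >> 30) & 0x1FF) as usize, // PDPT index
        ((vaddr >> 21) & 0x1FF) as usize, // page directory index
        ((vaddr >> 12) & 0x1FF) as usize, // page table index
    ]
}

// Low 12 bits survive translation unchanged (for a 4 KiB page).
fn page_offset(vaddr: u64) -> u64 {
    vaddr & 0xFFF
}
```

Each index selects the entry to follow at that level; the frame address in the final PTE plus the page offset is the physical address the hardware wants.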
Okay, I lied: hardware can use virtual addresses with the help of an IOMMU. It's usually meant for virtualization, though, so it won't be an option everywhere.
Re: NVMe confusion
Posted: Fri Dec 18, 2020 9:21 pm
by Ethin
Okay, so I've solved the problem (I added code to my memory manager's physical memory allocation function to determine whether the memory I was mapping was usable, along with a new argument requesting that it force the allocation -- which is used in PCIe, because PCIe memory areas aren't marked as "free"), and it maps everything properly now with no page faults. Now it's getting stuck waiting for a response from the identify command. I tried setting the interrupt vector for the NVMe controller to 0xC8 -- does that actually change the interrupt vector it uses, or does it just change the stored value without affecting anything else?
Edit: So I just realized I screwed up and wrote a 0 to INTMC instead of a 1. Still, I'm curious about the question above.
Edit 2: Okay, I'm definitely not getting *any* interrupts from the controller at all. I submit the identify command, ring the admin submission queue tail doorbell by writing a 1 to it (that's my queue tail), then wait on the interrupt. But I never get anything, even though I've unmasked *all* interrupts.
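For context, the usability check I described above is shaped roughly like this (a hypothetical sketch with made-up types and names, not my actual allocator code):

```rust
// Sketch of the check: before mapping a physical range, scan a
// firmware-provided memory map and refuse unless the range is "usable" --
// or the caller forces it (as for PCIe BARs, which are not marked free).
#[derive(Clone, Copy, PartialEq)]
enum RegionKind { Usable, Reserved, Mmio }

struct Region { start: u64, end: u64, kind: RegionKind }

fn can_map(map: &[Region], start: u64, len: u64, force: bool) -> bool {
    if force {
        return true; // e.g. PCIe BAR ranges, never marked "usable"
    }
    let end = start + len;
    // The range must lie entirely inside a single usable region.
    map.iter()
        .any(|r| r.kind == RegionKind::Usable && r.start <= start && end <= r.end)
}
```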
Re: NVMe confusion
Posted: Fri Dec 18, 2020 9:49 pm
by Octocontrabass
Ethin wrote:I tried setting the interrupt vector for the NVMe controller to 0xC8 -- does that actually change the interrupt vector it uses, or does it just change the stored value without affecting anything else?
If you're talking about the interrupt line register in the PCI configuration space, writing it doesn't do anything other than change the value.
Re: NVMe confusion
Posted: Fri Dec 18, 2020 10:05 pm
by Ethin
Octocontrabass wrote:Ethin wrote:I tried setting the interrupt vector for the NVMe controller to 0xC8 -- does that actually change the interrupt vector it uses, or does it just change the stored value without affecting anything else?
If you're talking about the interrupt line register in the PCI configuration space, writing it doesn't do anything other than change the value.
Yep, that's what I was talking about. I presume there's no way to change that vector?
Still haven't found any way to get interrupts... My handler is never called, which means the interrupt isn't being triggered.
Re: NVMe confusion
Posted: Fri Dec 18, 2020 11:10 pm
by Octocontrabass
Ethin wrote:I presume there's no way to change that vector?
The easiest way to choose your interrupt vector is to use MSI or MSI-X.
If you're not using either of those for whatever reason, you can still program the interrupt controller to choose a different vector. On some computers, you can use ACPI to choose how the NVMe controller's interrupt line is attached to the interrupt controller.
Note that the interrupt line register is only useful if you're using the dual PICs: it tells you which one of the fifteen PIC inputs the device is attached to.
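If you do stay on the PICs, remapping them to different base vectors is the standard ICW1..ICW4 initialization sequence. Here's a sketch that records the port writes instead of doing real outb, so the sequence itself is visible (illustrative only; swap the recorder for actual port I/O in a kernel):

```rust
// Sketch of the legacy dual-8259 PIC remap sequence (ICW1..ICW4). Instead of
// real `outb` port I/O this records (port, value) pairs so it can run and be
// inspected anywhere.
fn remap_pics(master_base: u8, slave_base: u8) -> Vec<(u16, u8)> {
    let mut log = Vec::new();
    let mut write = |port: u16, val: u8| log.push((port, val));
    write(0x20, 0x11);        // master ICW1: begin init, expect ICW4
    write(0xA0, 0x11);        // slave ICW1
    write(0x21, master_base); // master ICW2: vector offset for IRQ0..7
    write(0xA1, slave_base);  // slave ICW2: vector offset for IRQ8..15
    write(0x21, 0x04);        // master ICW3: slave attached at IRQ2
    write(0xA1, 0x02);        // slave ICW3: cascade identity
    write(0x21, 0x01);        // master ICW4: 8086 mode
    write(0xA1, 0x01);        // slave ICW4: 8086 mode
    log
}
```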
Re: NVMe confusion
Posted: Sat Dec 19, 2020 1:32 am
by Ethin
Octocontrabass wrote:Ethin wrote:I presume there's no way to change that vector?
The easiest way to choose your interrupt vector is to use MSI or MSI-X.
If you're not using either of those for whatever reason, you can still program the interrupt controller to choose a different vector. On some computers, you can use ACPI to choose how the NVMe controller's interrupt line is attached to the interrupt controller.
Note that the interrupt line register is only useful if you're using the dual PICs: it tells you which one of the fifteen PIC inputs the device is attached to.
Thanks. I'll go for MSI-X. Just one last question and then I'll get to hacking PCIe again: how do I know when I've reached the last capability in the capabilities list? Is it when the next capability pointer is 0?
Re: NVMe confusion
Posted: Sat Dec 19, 2020 10:36 am
by Octocontrabass
Ethin wrote:how do I know when I've reached the last capability in the capabilities list? Is it when the next capability pointer is 0?
Yes for both the capabilities list and the PCIe extended capabilities list.
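The walk over the standard list can be sketched like this (read8 stands in for a config-space read; the closure-over-an-array version is just for illustration):

```rust
// Sketch of walking the standard PCI capability list until the next pointer
// is zero. Returns (capability ID, offset) pairs. `read8` abstracts
// config-space access so the walk is testable against a fake config space.
fn walk_caps(read8: impl Fn(u8) -> u8) -> Vec<(u8, u8)> {
    let mut caps = Vec::new();
    // Capabilities pointer lives at offset 0x34; low 2 bits are reserved.
    let mut ptr = read8(0x34) & 0xFC;
    while ptr != 0 {
        let id = read8(ptr);
        let next = read8(ptr + 1) & 0xFC;
        caps.push((id, ptr));
        ptr = next; // a next pointer of 0 terminates the list
    }
    caps
}
```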
Re: NVMe confusion
Posted: Sat Dec 19, 2020 12:06 pm
by Ethin
Thanks! I'll hack on things and see where I get with it.
Re: NVMe confusion
Posted: Sat Dec 19, 2020 5:06 pm
by Ethin
Alright, almost got it. Three questions for MSI-X:
1. According to the table size field, it's 11h. Since an MSI-X table entry is 16 bytes, would this mean I'd allocate BAR[BIR]+table offset-BAR[BIR]+table offset+(16*table size + 1)?
2. If my MSI-X table contains 18 entries/18 interrupts, how does each MSI-X interrupt map to interrupts in the IDT?
3. The MSI-X entry contains message data and a message address. What precisely does the message data do, and should I fiddle with the message address?
Re: NVMe confusion
Posted: Sun Dec 20, 2020 12:51 am
by Octocontrabass
Ethin wrote:1. According to the table size field, it's 11h. Since an MSI-X table entry is 16 bytes, would this mean I'd allocate BAR[BIR]+table offset-BAR[BIR]+table offset+(16*table size + 1)?
I think you have the right idea, but your parentheses are in the wrong place.
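Concretely (a sketch; here table_offset already includes BAR[BIR], and the table-size field is encoded as N-1):

```rust
// The MSI-X table-size field holds the entry count minus one, so a field
// value of 0x11 means 0x12 (18) entries. With 16-byte entries, the region
// to map is 16 * (size_field + 1) bytes long.
fn msix_table_span(table_offset: u64, size_field: u16) -> (u64, u64) {
    let entries = (size_field as u64) + 1;
    (table_offset, table_offset + 16 * entries) // half-open [start, end)
}
```

With a size field of 11h, that's 0x12 = 18 entries, i.e. 288 bytes.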
Ethin wrote:2. If my MSI-X table contains 18 entries/18 interrupts, how does each MSI-X interrupt map to interrupts in the IDT?
You can choose any vector from 0x10 to 0xFE, and you can assign a different vector to each MSI-X interrupt.
Ethin wrote:3. The MSI-X entry contains message data and a message address. What precisely does the message data do, and should I fiddle with the message address?
When the device wishes to raise the interrupt specified by that entry, it writes the message data to the message address. You have to set appropriate values in the address and data registers so that the host hardware will raise an interrupt in response to the device writing the message data to the message address.
Intel defines the appropriate address and data values for x86 hardware in section 10.11 of volume 3A of the SDM. Make sure you set the trigger mode to edge; MSI and MSI-X don't support level-triggered interrupts.
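The layout from that section can be sketched as follows (helper names are mine; the destination and delivery-mode values are up to you):

```rust
// Sketch of building the x86 MSI message address and data per the layout in
// Intel SDM vol. 3A, section 10.11: address 0xFEExxxxx with the destination
// APIC ID in bits 19:12, RH in bit 3, DM in bit 2; data carrying the vector
// in bits 7:0 and the delivery mode in bits 10:8, edge-triggered.
fn msi_address(dest_apic_id: u8, redirection_hint: bool, logical_dest: bool) -> u32 {
    0xFEE0_0000
        | ((dest_apic_id as u32) << 12)
        | ((redirection_hint as u32) << 3)
        | ((logical_dest as u32) << 2)
}

fn msi_data(vector: u8, delivery_mode: u8) -> u32 {
    // Trigger-mode bit 15 stays 0 (edge), as MSI and MSI-X require.
    (vector as u32) | ((delivery_mode as u32) << 8)
}
```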
Re: NVMe confusion
Posted: Sun Dec 20, 2020 2:15 pm
by Ethin
I think I almost got it. Now I'm suffering a really weird invalid opcode exception. It occurs when I try to set the last interrupt vector in the table. At first I thought it was happening because I was referencing an invalid vector, so I told it to go from 0 to tsize-2, and it still does it. My code is as follows for this part:
Code:
let table: DynamicVolBlock<u128> = unsafe { DynamicVolBlock::new(memstart as usize, tsize as usize) };
// Program every entry except the last (0 .. tsize-1) while debugging the fault
(0 .. (tsize-1) as usize).for_each(|e| {
    let mut entry = table.index(e).read();
    entry.set_bit(96, false); // vector control: clear the per-entry mask bit
    // Message address: 0xFEE in bits 31:20, destination 0xFF in bits 19:12,
    // RH (bit 3) and DM (bit 2) set, bits 1:0 zero
    let mut msgaddr = 0u64;
    msgaddr.set_bits(32 .. 64, 0);
    msgaddr.set_bits(20 .. 32, 0x0FEE);
    msgaddr.set_bits(12 .. 20, 0xFF);
    msgaddr.set_bit(3, true);
    msgaddr.set_bit(2, true);
    msgaddr.set_bits(0 .. 2, 0);
    entry.set_bits(0 .. 64, msgaddr as u128);
    // Message data: edge trigger (bit 15 clear), lowest-priority delivery
    // (bits 10:8 = 001), vector in bits 7:0
    let mut msgdata = 0u32;
    msgdata.set_bit(15, false);
    msgdata.set_bit(14, false);
    msgdata.set_bits(8 .. 11, 0x01);
    msgdata.set_bits(0 .. 8, int as u32);
    msgdata.set_bits(16 .. 32, 0);
    entry.set_bits(64 .. 96, msgdata as u128);
    table.index(e).write(entry);
});
Edit: Okay, I just debugged it, and the fault was never actually raised. What's odd is that it raised the MSI-X vector, which means I did everything correctly (I think), but the interrupt still wasn't ever raised by the APIC. The controller does note the raising of the interrupt in the qemu logfile, though. In my code I set bit 15 of word 1 of the message control register to 1, but not bit 14; according to the PCI base spec, v. 3.0, bits 14:11 are reserved.
Edit 2: And I was an idiot... Note to self: the table size is not bits 10:0 of dword 0 of the MSI-X capability. Wow. I completely missed that.
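For the record, a sketch of pulling the size from the right place -- Message Control is the upper 16 bits of the capability's first dword, and the size field is its low 11 bits, encoded as N-1 (function name is mine):

```rust
// Extract the MSI-X table entry count from the capability's first dword:
// bits 15:0 are cap ID + next pointer, bits 31:16 are Message Control, and
// Message Control's bits 10:0 hold the table size, encoded as N-1.
fn msix_table_entries(cap_dword0: u32) -> u32 {
    let message_control = cap_dword0 >> 16;
    (message_control & 0x7FF) + 1
}
```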
Re: NVMe confusion
Posted: Sun Dec 20, 2020 4:07 pm
by Octocontrabass
Ethin wrote:What's odd is that it raised the MSI-X vector, which means I did everything correctly (I think), but the interrupt still wasn't ever raised by the APIC.
How are you checking whether the local APIC has received the IRQ? If your ISR is running, the LAPIC should say that it received the IRQ. (Make sure you're checking before you send the EOI.)
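To locate a vector's bit in the ISR, a sketch (xAPIC MMIO layout; the same arithmetic works for the IRR at offset 0x200):

```rust
// The xAPIC in-service register (ISR) is eight 32-bit registers at MMIO
// offsets 0x100, 0x110, ..., 0x170, each covering 32 vectors. This returns
// the register offset and bit mask for a given vector; a real kernel would
// do a volatile read at lapic_base + offset and test the bit.
fn isr_location(vector: u8) -> (usize, u32) {
    let reg_offset = 0x100 + (vector as usize / 32) * 0x10;
    let bit = 1u32 << (vector % 32);
    (reg_offset, bit)
}
```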
Re: NVMe confusion
Posted: Sun Dec 20, 2020 6:25 pm
by Ethin
Octocontrabass wrote:Ethin wrote:What's odd is that it raised the MSI-X vector, which means I did everything correctly (I think), but the interrupt still wasn't ever raised by the APIC.
How are you checking whether the local APIC has received the IRQ? If your ISR is running, the LAPIC should say that it received the IRQ. (Make sure you're checking before you send the EOI.)
I'm using the LAPIC, yes. My Rust code contains a list of functions to attach to each interrupt (interrupts 48-255, primarily). There'll be some interrupt sharing, sadly, but it's not as though Intel or AMD are going to let us make the IDT an arbitrary size to handle an arbitrary number of interrupts, because that would undoubtedly require huge architectural changes. Anyway.
When I initialize the controller, I register a handler function for the given interrupt. I have the handler log something when it's called, and I never see that message printed. The IRQ is already in the IDT, and the function should be working, but either the interrupt is never being fired/never being received by the chipset, or I've messed something up somewhere. According to the NVMe controller log I still have set up, the controller is definitely raising the interrupt ("pci_nvme_irq_msix raising MSI-X IRQ vector 0"). As a side note, my controller configuration is 0x460001 and my AQA is set to 0x7ff07ff. My ASQ is located at 0x18f8b000 and my ACQ at 0x154461000, if that's any help. I'm honestly not sure what's wrong here.
Edit: Yeah, definitely confused. I added trace events for the PIC, APIC, and IOAPIC, and it's definitely being received and (I believe) delivered:
apic_deliver_irq dest 255 dest_mode 1 delivery_mode 1 vector 231 trigger_mode 0
Here's how I set the bits when I was configuring my interrupts:
- Destination ID: 0xFF (all processors)
- Redirection hint indication: yes
- Destination mode: logical
- Vector: randomly assigned (to avoid overusing any one vector as much as possible)
- Delivery mode: lowest priority
- Level: reserved
- Trigger mode: Edge sensitive
I'm trying a different interrupt configuration (with RH and DM clear and delivery mode set to fixed). I'm not going to try any of the other modes because I don't think any of those would actually work.
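To sanity-check that configuration against the trace output, here's a sketch that encodes the fields as listed above and decodes them the way the apic_deliver_irq line reports them (field positions follow the standard x86 MSI layout; function names are mine):

```rust
// Encode an MSI-X entry's address/data from the configuration described
// above (dest 0xFF, RH set, logical destination mode, a delivery mode, and
// a vector), then decode the fields qemu's apic_deliver_irq trace reports.
fn encode(dest: u8, rh: bool, logical: bool, delivery: u8, vector: u8) -> (u32, u32) {
    let addr = 0xFEE0_0000
        | ((dest as u32) << 12)
        | ((rh as u32) << 3)
        | ((logical as u32) << 2);
    let data = (vector as u32) | ((delivery as u32) << 8); // bit 15 clear = edge
    (addr, data)
}

// Returns (dest, dest_mode, delivery_mode, vector, trigger_mode).
fn decode_trace(addr: u32, data: u32) -> (u32, u32, u32, u32, u32) {
    let dest = (addr >> 12) & 0xFF;
    let dest_mode = (addr >> 2) & 1;
    let delivery_mode = (data >> 8) & 0x7;
    let vector = data & 0xFF;
    let trigger_mode = (data >> 15) & 1;
    (dest, dest_mode, delivery_mode, vector, trigger_mode)
}
```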
Edit: Okay, so I just tried that. And it failed. Again, the IRQ is definitely being delivered; however, it never actually fires.
Edit: Okay, so I... had another idiot moment again.
I seem to be having a lot of those. XD
Anyway, my error was really simple -- I was registering the handler *after* the controller had been set up. Everything was working fine; there just wasn't a handler for that interrupt yet. So never mind -- I've got everything working (I think). Thanks for the help!
Re: NVMe confusion
Posted: Sun Dec 20, 2020 7:20 pm
by Ethin
Okay, so I jumped to premature conclusions again. I might take a break from this tonight, but I'm still confused about exactly where I'm going wrong. I used Brendon's post on the layout of both the message address and data registers, and my interrupt handler is being registered, so I'm just confused. I've tried the lowest priority, fixed, and ExtInt delivery modes.
My code for PCIe/MSI-X is over here:
https://github.com/ethindp/kernel/blob/ ... src/pci.rs (lines 261-281).
I'd appreciate the help, but I'm taking a break from it tonight, I think.