Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Sun Dec 03, 2017 7:13 am
by filkra
Korona wrote:So you still get the NMI after EHCI hand-off works?
I still get NMIs. I disabled all legacy support bits for all UHCIs and EHCIs and performed a BIOS handoff for the EHCI. SMI bits were set on both UHCI and EHCI, but the BIOS Owned Semaphore was not set on the EHCI. Regardless, I set the OS Owned Semaphore before attempting any USB transactions. Is it possible for the current qTD in the overlay area to point to 0x0? Right after I set next qTD to my SETUP qTD (getDeviceDescriptor), which is linked to the other qTDs (SETUP -> IN -> OUT), current qTD points to 0x0.
Korona wrote:Other things you should check (after you got the UHCI legacy-disabling to work): Is bus-mastering enabled in the PCI control register? Are PCI bridges set up correctly? Are EHCI registers marked as uncacheable in the MTRRs?
Yes, I enabled Bus Mastering, IO Space, and Memory Space. I haven't touched anything other than power management and the regular PCI control registers yet. Is it necessary to configure PCI bridges? Wouldn't BIOS do this? I think all registers are not being cached, as I get responses for certain actions (e.g. HCRESET).
Korona wrote:Can you test your code on other computers? Do you want me to test it on a desktop? I have an ICH7 desktop that I could run it on and copy/paste serial output and/or screenshots.
I can test it on a desktop, tomorrow. I don't think I'm allowed to distribute the code (or image), as this is a university project I work on with colleagues. But thanks for the offer :)

//EDIT

I checked the host controller's status register. Host System Error gets set right after the transaction. Next qTD is set to my SETUP qTD, but current qTD is set to 0x0.

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Sun Dec 03, 2017 11:06 am
by Korona
filkra wrote:Yes, I enabled Bus Mastering, IO Space, and Memory Space. I haven't touched anything other than power management and the regular PCI control registers yet. Is it necessary to configure PCI bridges? Wouldn't BIOS do this? I think all registers are not being cached, as I get responses for certain actions (e.g. HCRESET).
Yes, the BIOS should set up PCI bridges and MTRRs. I have never encountered a BIOS that does not do this correctly, but I know that Linux has some code to check and/or reconfigure bridges for PCs with BIOS bugs. These bugs seem to be rare but they do exist. You could at least dump the (non-host-)bridge's setup and make sure that everything is correct.

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Mon Dec 04, 2017 3:19 am
by filkra
Korona wrote:Yes, the BIOS should set up PCI bridges and MTRRs. I have never encountered a BIOS that does not do this correctly, but I know that Linux has some code to check and/or reconfigure bridges for PCs with BIOS bugs. These bugs seem to be rare but they do exist. You could at least dump the (non-host-)bridge's setup and make sure that everything is correct.
I checked the EHCI's PCI bus and it's 0, so (if I understood this correctly) PCI bridges shouldn't be involved, should they? Maybe I'm setting up the queue head the wrong way. Still, inside QEMU everything is working fine.

Is it wrong to set the control queue head (for the control endpoint) to address=0, endpoint=0, maxPacketSize=64, linkPointer=&idleQueueHead, pipeMultiplier=0x01 (1 transaction / micro frame) and speed=0x2 (high-speed)? All other queue head fields are 0x0. After this I set the idleQueueHead's linkPointer to my new queue head. In fact, how do I know which speed to set before getting the device descriptor? Currently, I know it's a high-speed device, so I hardcoded it.
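
The fields listed above live in the second and third dwords of the EHCI queue head. A minimal sketch of packing them, assuming the bit layout from the EHCI 1.0 specification; the helper names and the EHCI_EPS_* constants are hypothetical and not taken from the code discussed in this thread:

```c
#include <stdint.h>

/* Endpoint speed encodings for the EPS field (assumed from EHCI 1.0). */
enum { EHCI_EPS_FULL = 0, EHCI_EPS_LOW = 1, EHCI_EPS_HIGH = 2 };

/* Pack the Endpoint Characteristics dword of a queue head. */
static uint32_t ehci_qh_ep_char(uint8_t dev_addr, uint8_t endpoint,
                                uint8_t eps, uint16_t max_packet,
                                int dtc, int head)
{
    return  ((uint32_t)(dev_addr   & 0x7F))         /* bits  6:0  device address      */
          | ((uint32_t)(endpoint   & 0x0F)  << 8)   /* bits 11:8  endpoint number     */
          | ((uint32_t)(eps        & 0x03)  << 12)  /* bits 13:12 endpoint speed      */
          | ((uint32_t)(dtc  ? 1 : 0)       << 14)  /* bit  14    data toggle control */
          | ((uint32_t)(head ? 1 : 0)       << 15)  /* bit  15    head of reclamation */
          | ((uint32_t)(max_packet & 0x7FF) << 16); /* bits 26:16 max packet length   */
}

/* Pack the Endpoint Capabilities dword for a high-speed async endpoint:
 * only the Mult field (bits 31:30) is set, masks and hub fields stay 0. */
static uint32_t ehci_qh_ep_caps(uint8_t mult)
{
    return (uint32_t)(mult & 0x03) << 30;
}
```

With the values from the post, ehci_qh_ep_char(0, 0, EHCI_EPS_HIGH, 64, 0, 0) packs the characteristics dword with every other field left at 0. Many drivers set dtc to 1 for control endpoints so that the data toggle comes from each qTD, which may be worth comparing against a setup that leaves it at 0.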

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Mon Dec 04, 2017 12:22 pm
by BenLunt
filkra wrote:
Korona wrote:Yes, the BIOS should set up PCI bridges and MTRRs. I have never encountered a BIOS that does not do this correctly, but I know that Linux has some code to check and/or reconfigure bridges for PCs with BIOS bugs. These bugs seem to be rare but they do exist. You could at least dump the (non-host-)bridge's setup and make sure that everything is correct.
I checked the EHCIs PCI bus and it's 0, so (if I understood this correctly) PCI bridges shouldn't be involved, should they?
That should be a correct assumption, but someone with better knowledge should probably answer that one.
filkra wrote:Is it wrong to set the control queue head (for the control endpoint) to address=0, endpoint=0, maxPacketSize=64, linkPointer=&idleQueueHead, pipeMultiplier=0x01 (1 transaction / micro frame) and speed=0x2 (high-speed)? All other queue head fields are 0x0. After this I set the idleQueueHead's linkPointer to my new queue head.
Some requests need the device to be in the addressed state, however, you are still trying to get the Device Descriptor, so address = 0 is correct. I just thought of something. You don't have more than one device attached do you? i.e.: there are not two or more devices in the default state? If so, this might cause a parity error.
filkra wrote:In fact, how do I know which speed to set before getting the device descriptor? Currently, I know it's a high-speed device, so I hardcoded it.
Simple. If you are on a UHCI or OHCI controller, check the bit in the port register that distinguishes LS from FS. If you are sending packets via the EHCI, it is an HS device. This works most of the time. Rate Matching Hubs, along with a few other things, can be a factor. However, if you are sending packets via the EHCI because the EHCI set the enabled bit on one of its ports, you can (mostly) assume it is a high-speed device.
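
The rule of thumb above can be sketched as two checks on the EHCI PORTSC value, one before the port reset and one after it. This assumes the PORTSC bit positions from the EHCI 1.0 specification; the function and enum names are hypothetical:

```c
#include <stdint.h>

#define EHCI_PORTSC_CONNECT    (1u << 0)   /* Current Connect Status      */
#define EHCI_PORTSC_ENABLED    (1u << 2)   /* Port Enabled/Disabled       */
#define EHCI_PORTSC_LINE_MASK  (3u << 10)  /* Line Status, bits 11:10     */
#define EHCI_PORTSC_LINE_K     (1u << 10)  /* 01b: K-state, low-speed     */

typedef enum {
    USB_PORT_EMPTY,             /* nothing connected                        */
    USB_PORT_HIGH_SPEED,        /* EHCI enabled the port after the reset    */
    USB_PORT_HAND_TO_COMPANION  /* not enabled: leave it to UHCI/OHCI       */
} usb_port_verdict_t;

/* Before the reset: a connected, not yet enabled port showing K-state
 * is a low-speed device and can be handed to the companion right away. */
static int ehci_port_is_low_speed(uint32_t portsc)
{
    return (portsc & EHCI_PORTSC_CONNECT)
        && !(portsc & EHCI_PORTSC_ENABLED)
        && ((portsc & EHCI_PORTSC_LINE_MASK) == EHCI_PORTSC_LINE_K);
}

/* After the reset has completed: the controller sets Port Enabled only
 * for a high-speed device. */
static usb_port_verdict_t ehci_port_after_reset(uint32_t portsc)
{
    if (!(portsc & EHCI_PORTSC_CONNECT))
        return USB_PORT_EMPTY;
    return (portsc & EHCI_PORTSC_ENABLED) ? USB_PORT_HIGH_SPEED
                                          : USB_PORT_HAND_TO_COMPANION;
}
```

As noted above, this covers the common case only; hubs between the port and the device are not considered.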

Ben

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Mon Dec 04, 2017 1:43 pm
by filkra
BenLunt wrote:I just thought of something. You don't have more than one device attached do you? i.e.: there are not two or more devices in the default state? If so, this might cause a parity error.
I attached only one USB thumb drive. I dumped the registers right after reading how many ports are available. This is how the BIOS sets up the controller:

[Image: EHCI register dump as configured by the BIOS]

My thumb drive is on Port 5. Port 8 has its Connect Status Change and Port Owner bits set (the Current Connect Status bit is cleared). After stopping, resetting and starting the controller this does not change. Just to be sure, how can I avoid multiple devices being in the default state? I thought that, after resetting the host controller, each port's device should only enter the default state after it gets reset (by using PORTSC Port Reset) by the host controller.

This is my configuration:

[Image: EHCI register dump with my configuration]

It's almost identical to the BIOS configuration except I set the asynchronous and periodic list addresses and enabled both of them.

I noticed something after disabling legacy support on all UHCIs. After doing this, the NMIs occur once every 30 seconds. However, if I don't disable legacy support I get flooded by NMIs (multiple NMIs per second).

//EDIT

I checked the UHCIs' port registers and it seems a device is attached. The second port on the fourth UHCI (this should be the same as the 8th port on the EHCI) has its Current Connect Status bit set. Maybe this is the second device in the default state.

//EDIT 2

I found out it's the ThinkPad's fingerprint sensor. Still no luck in getting device descriptors :?

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Tue Dec 05, 2017 3:57 pm
by BenLunt
filkra wrote:My thumb drive is on Port 5. Port 8 has its Connect Status Change and Port Owner bits set (the Current Connect Status bit is cleared). After stopping, resetting and starting the controller this does not change. Just to be sure, how can I avoid multiple devices being in the default state? I thought that, after resetting the host controller, each port's device should only enter the default state after it gets reset (by using PORTSC Port Reset) by the host controller.
Reset and enabled. On a companion controller, UHCI or OHCI, you must set the enabled state yourself. On the EHCI, the controller sets the enabled state. (This is not specific to companion controllers: UHCI and OHCI controllers predate USB 2.0, and it is USB 2.0 that states that the controller should set the enabled state.)

What is the port register value of the companion controller (UHCI or OHCI) of the port that the finger-print reader is on? If it is in the enabled state and you place Port 5 (your thumb drive) in the enabled state, you may have two ports with a device each in the default state.

Have you reset each companion controller while initializing your EHCI controller?

The Port Owner bit in Port 8 (in your image) is set. This means the companion controller has control of that port. What happens when you write a zero to the Port Owner bit (bit 13) in Port 8?

Look at the companion controller's port value. Also try making the EHCI take back control of that port, then see if you can get the device descriptor of the thumb drive.
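
Taking a port back (or giving it away) is a read-modify-write of PORTSC, and the change bits in that register are write-1-to-clear, so they should be masked out before writing. A minimal sketch of computing the value to write, assuming the EHCI 1.0 bit positions; the helper names are hypothetical:

```c
#include <stdint.h>

#define EHCI_PORTSC_CSC    (1u << 1)   /* Connect Status Change (RW1C)      */
#define EHCI_PORTSC_PEC    (1u << 3)   /* Port Enable/Disable Change (RW1C) */
#define EHCI_PORTSC_OCC    (1u << 5)   /* Over-current Change (RW1C)        */
#define EHCI_PORTSC_OWNER  (1u << 13)  /* Port Owner                        */
#define EHCI_PORTSC_RW1C   (EHCI_PORTSC_CSC | EHCI_PORTSC_PEC | EHCI_PORTSC_OCC)

/* Value to write so that the EHCI owns the port again. Change bits are
 * written as 0 so that pending status changes are not acknowledged. */
static uint32_t ehci_portsc_claim(uint32_t portsc)
{
    return (portsc & ~EHCI_PORTSC_RW1C) & ~EHCI_PORTSC_OWNER;
}

/* Value to write to hand the port to the companion controller. */
static uint32_t ehci_portsc_release(uint32_t portsc)
{
    return (portsc & ~EHCI_PORTSC_RW1C) | EHCI_PORTSC_OWNER;
}
```

The caller would read PORTSC, pass the value through one of these helpers, and write the result back to the same register.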

Ben

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Wed Dec 06, 2017 3:37 am
by filkra
BenLunt wrote:What is the port register value of the companion controller (UHCI or OHCI) of the port that the finger-print reader is on? If it is in the enabled state and you place Port 5 (your thumb drive) in the enabled state, you may have two ports with a device each in the default state.
The fingerprint sensor's port reads 0x93 (Device Present, Connect Status Change, Line Status=0x1 and Low Speed Device Attached). I don't enable it.
BenLunt wrote:Have you reset each companion controller while initializing your EHCI controller?
I reset each UHCI (in ascending order) before touching any EHCI registers. Precisely, I do the following:
  1. Enable Bus Master and IO Space
  2. Disable Legacy Support by writing 0x8F00 to the LEGSUP PCI configuration register at 0xC0
  3. Disable all interrupts by writing 0x0000 to the USBINTR register
  4. Stop the host controller by clearing Run/Stop in USBCMD and wait until Host Controller Halted in USBSTS gets set
  5. Reset the host controller by setting Host Controller Reset in USBCMD and wait until it gets cleared again
  6. Allocate a 4K-aligned array with 1024 4-byte entries (Frame List) and set every entry to 0x1 (Terminate bit)
  7. Write 0x40 (1ms) to SOF
  8. Write 0x0 to FRNUM
  9. Write the allocated array's address to FLBASEADD
  10. Start the host controller by setting Run/Stop in USBCMD and wait until Host Controller Halted in USBSTS gets cleared
  11. Set Configure Flag in USBCMD
  12. For each port: set Port Reset in PORTSC, wait 100ms and clear it again
After these steps, all ports read 0x80 (Default) except for the 4th UHCI's 2nd port, which reads 0x93.
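
Step 6 of the list above can be sketched as follows. The static buffer stands in for whatever allocator the kernel uses, and all names are hypothetical; in a real driver it is the physical address written to FLBASEADD that has to be 4K-aligned:

```c
#include <stdint.h>
#include <stddef.h>

#define UHCI_FRAME_LIST_ENTRIES 1024u
#define UHCI_FLP_TERMINATE      0x00000001u  /* T bit: entry points to nothing */

/* Demo storage only: 1024 dwords, aligned to 4 KiB (C11 _Alignas). */
static _Alignas(4096) uint32_t demo_frame_list[UHCI_FRAME_LIST_ENTRIES];

/* Mark every frame list entry as terminated so the controller finds
 * an empty schedule in every frame. */
static void uhci_frame_list_init(uint32_t *frame_list)
{
    for (size_t i = 0; i < UHCI_FRAME_LIST_ENTRIES; i++)
        frame_list[i] = UHCI_FLP_TERMINATE;
}

/* FLBASEADD only holds bits 31:12, so the low 12 bits must be zero. */
static int uhci_frame_list_aligned(uintptr_t address)
{
    return (address & 0xFFFu) == 0;
}
```

Calling uhci_frame_list_init(demo_frame_list) before writing the base address gives the controller a schedule with no work in it.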
BenLunt wrote:The Port Owner bit in Port 8 (in your image) is set. This means the companion controller has control of that port. What happens when you write a zero to the Port Owner bit (bit 13) in Port 8?
The pictures from before were taken without resetting and configuring the UHCI. I only disabled Legacy Support. Now the Port Owner bit is cleared once I stop, reset and start the EHCI. These are my steps for the EHCI:
  1. Enable Bus Master and Memory Space
  2. Perform BIOS handoff using LEGSUP register and disable Legacy Support by writing 0x00000000 to LEGCTLSTS (EECP + 4)
  3. Read EHCI configuration (Frame List Size, Number of Ports, Version)
  4. Disable all interrupts by clearing all bits in USBINTR
  5. Stop the host controller by clearing Periodic Schedule Enable, Asynchronous Schedule Enable and Run/Stop in USBCMD
  6. Reset the host controller by setting Host Controller Reset in USBCMD and wait until it gets cleared again
  7. Write 0 to CTRLDSSEGMENT (it's a 32-bit OS)
  8. Set up the Periodic Schedule using the previously stored configuration (4K aligned, all T bits set)
  9. Set up the Asynchronous Schedule by creating a Queue Head with all bits set to 0 except for the Head of Reclamation List Flag (next and alt qTD = 0x1)
  10. Set Interrupt Threshold Control to 0x08 (1ms)
  11. Start the host controller by setting Run/Stop and wait until Host Controller Halted gets cleared
  12. Set Configured Flag
  13. Enable the Periodic and Asynchronous Schedules by setting Periodic Schedule Enable and Asynchronous Schedule Enable and wait until USBSTS reports they are running
  14. For each port:
    1. If Line Status == KSTATE, hand off to the companion controller by setting Port Owner and continue with the next port
    2. Disable the port by clearing Port Enabled in PORTSC
    3. Set Port Reset, wait 100ms and clear it again
    4. Wait until Port Reset is really cleared
After resetting Port 5 (the USB thumb drive), this is the EHCI's state:

[Image: EHCI register state after resetting Port 5]
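
The BIOS handoff in step 2 of the EHCI list comes down to a few bit tests on the legacy support capability. A minimal sketch of those tests, assuming the USBLEGSUP layout from the EHCI 1.0 specification; the helper names are hypothetical, and the PCI configuration space reads and writes themselves are left out:

```c
#include <stdint.h>

#define EHCI_HCCPARAMS_EECP(h)   (((h) >> 8) & 0xFFu)  /* extended caps pointer   */
#define EHCI_LEGSUP_CAP_ID       0x01u                 /* legacy support cap ID   */
#define EHCI_LEGSUP_BIOS_OWNED   (1u << 16)            /* HC BIOS Owned Semaphore */
#define EHCI_LEGSUP_OS_OWNED     (1u << 24)            /* HC OS Owned Semaphore   */

/* The capability exists if EECP points past the standard PCI header
 * (0x40 or above) and the dword found there carries capability ID 1.
 * Only the first capability in the list is looked at here. */
static int ehci_has_legsup(uint32_t hccparams, uint32_t legsup)
{
    return EHCI_HCCPARAMS_EECP(hccparams) >= 0x40u
        && (legsup & 0xFFu) == EHCI_LEGSUP_CAP_ID;
}

/* Value to write back to request ownership. */
static uint32_t ehci_legsup_request(uint32_t legsup)
{
    return legsup | EHCI_LEGSUP_OS_OWNED;
}

/* The handoff is complete once the OS semaphore is set and the BIOS
 * semaphore has been cleared by the firmware. */
static int ehci_legsup_handoff_done(uint32_t legsup)
{
    return (legsup & EHCI_LEGSUP_OS_OWNED)
        && !(legsup & EHCI_LEGSUP_BIOS_OWNED);
}
```

A driver would write the requested value to configuration offset EECP, re-read it in a loop with a timeout until ehci_legsup_handoff_done() holds, and only then write 0 to EECP + 4 as in step 2.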

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Wed Dec 06, 2017 8:43 am
by BenLunt
Since you have verified that the device on the companion controller is in the non-enabled state after a reset, you can probably assume that it doesn't have anything to do with your NMI issue.

Without the machine physically in front of me, running my own tests, etc., I am at a loss. Sorry.

Ben

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Wed Dec 06, 2017 3:36 pm
by filkra
BenLunt wrote:Since you have verified that the device on the companion controller is in the non-enabled state after a reset, you can probably assume that it doesn't have anything to do with your NMI issue.
You are right. I opened the ThinkPad and removed the fingerprint sensor's connector. After this, neither UHCI nor EHCI detected a device on port 8. Still, NMIs occur once I insert a qTD. Can I try reading the device descriptor right after resetting the port or are there any additional steps involved? Currently, the only device attached is the USB thumb drive I am booting from.

//EDIT

Another question: Do I need paging enabled for the EHCI to work? The specification states that the transfer buffer has to be virtually contiguous.

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Thu Dec 07, 2017 8:50 am
by Korona
Parity errors are PCI errors. In theory, no USB transaction should ever generate parity errors, regardless of the devices that are attached to the USB (and in practice I assume this to be true until I'm proven wrong). Another thing that came to my mind: Do you configure any PCI devices other than EHCI? Do you maybe configure overlapping BARs or do you remap the BAR to some region occupied by system memory or by a legacy device? Are you sure that the region you DMA from/to is regular RAM? Did you check the BIOS memory map to verify that?
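
The memory map check suggested above can be sketched as a containment test against the usable entries of an E820-style map. The entry layout and the sample map below are made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

#define E820_TYPE_USABLE 1u

typedef struct {
    uint64_t base;
    uint64_t length;
    uint32_t type;
} e820_entry_t;

/* Hypothetical map: low memory, a reserved hole, then RAM above 1 MiB. */
static const e820_entry_t sample_map[] = {
    { 0x00000000u, 0x0009FC00u, 1u },
    { 0x0009FC00u, 0x00000400u, 2u },
    { 0x00100000u, 0x7FF00000u, 1u },
};

/* Returns 1 if [addr, addr + len) lies completely inside one usable
 * entry. A range spanning two adjacent usable entries is rejected,
 * which is the conservative answer for a DMA buffer. */
static int dma_range_is_ram(const e820_entry_t *map, size_t count,
                            uint64_t addr, uint64_t len)
{
    if (len == 0 || addr + len < addr)      /* empty or wrapping range */
        return 0;
    for (size_t i = 0; i < count; i++) {
        const e820_entry_t *e = &map[i];
        if (e->type != E820_TYPE_USABLE || addr < e->base)
            continue;
        uint64_t offset = addr - e->base;
        if (offset <= e->length && len <= e->length - offset)
            return 1;
    }
    return 0;
}
```

Running every queue head, qTD and transfer buffer address through such a check before handing it to the controller rules out one class of bad DMA targets.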

No, you do not need paging for EHCI to work. The buffer still needs to be per-physical-4k-page contiguous for EHCI to be able to address it. If it is physically contiguous, everything will work just fine.
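
For the physically contiguous case described above, filling the five buffer pointers of a qTD can be sketched like this. The function name is hypothetical, and the layout (first pointer carries the byte offset, the rest are page-aligned) is assumed from the EHCI 1.0 specification:

```c
#include <stdint.h>

#define EHCI_PAGE_SIZE    4096u
#define EHCI_QTD_BUFFERS  5u

/* Fill the buffer pointer dwords of a qTD for a physically contiguous
 * buffer starting at phys. Returns 0 on success, or -1 if len does not
 * fit into the five pages one qTD can address. Buffers that end close
 * to the 4 GiB boundary are not handled in this sketch. */
static int ehci_qtd_set_buffer(uint32_t buffer[EHCI_QTD_BUFFERS],
                               uint32_t phys, uint32_t len)
{
    uint32_t offset = phys & (EHCI_PAGE_SIZE - 1u);

    if (len > EHCI_QTD_BUFFERS * EHCI_PAGE_SIZE - offset)
        return -1;

    buffer[0] = phys;                       /* page 0 keeps the byte offset */
    for (uint32_t i = 1; i < EHCI_QTD_BUFFERS; i++)
        buffer[i] = (phys & ~(EHCI_PAGE_SIZE - 1u)) + i * EHCI_PAGE_SIZE;
    return 0;
}
```

If the buffer were not physically contiguous, each of the later pointers would have to be looked up individually instead of being computed from the first one.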

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Thu Dec 07, 2017 9:40 am
by filkra
Korona wrote:Do you configure any PCI devices other than EHCI? Do you maybe configure overlapping BARs or do you remap the BAR to some region occupied by system memory or by a legacy device? Are you sure that the region you DMA from/to is regular RAM? Did you check the BIOS memory map to verify that?
Currently I only configure the UHCI and EHCI and don't remap the BARs. I allocate 5 buffers (each 4k in size, 4k-aligned and contiguous). The buffers are located at memory addresses that are reported as FREE in the memory map. Maybe the buffers aren't written to memory and stay cached. This would lead to writing to 0 instead of my DMA buffers. I declared all struct fields as volatile, so this should not happen, but who knows...

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Thu Dec 07, 2017 9:58 am
by bellezzasolo
filkra wrote:
Korona wrote:Do you configure any PCI devices other than EHCI? Do you maybe configure overlapping BARs or do you remap the BAR to some region occupied by system memory or by a legacy device? Are you sure that the region you DMA from/to is regular RAM? Did you check the BIOS memory map to verify that?
Currently I only configure the UHCI and EHCI and don't remap the BARs. I allocate 5 buffers (each 4k in size, 4k-aligned and contiguous). The buffers are located at memory addresses that are reported as FREE in the memory map. Maybe the buffers aren't written to memory and stay cached. This would lead to writing to 0 instead of my DMA buffers. I declared all struct fields as volatile, so this should not happen, but who knows...
You'd need to configure your paging as non-cacheable or use sfence, since volatile only tells the compiler not to optimize the code. It does sweet nothing regarding the processor (or certainly, is not guaranteed to do so). So there's caching and pipelining to beware of.
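
One way to express the ordering concern in portable C is a fence between filling in a descriptor and the store that makes it reachable by the controller. This is only a sketch: it uses the C11 atomic_thread_fence() as a stand-in for an explicit sfence/mfence, which is an assumption about what the compiler emits, and the function name is hypothetical:

```c
#include <stdint.h>
#include <stdatomic.h>

/* Link a fully initialized qTD into a queue head's next-qTD field.
 * All stores to the qTD itself are expected to happen before this call;
 * the first fence keeps them ordered ahead of the link store, the
 * second keeps the link store ahead of whatever the driver does next
 * (for example ringing a doorbell or polling the status register). */
static void ehci_publish_qtd(volatile uint32_t *qh_next_qtd, uint32_t qtd_phys)
{
    atomic_thread_fence(memory_order_seq_cst);
    *qh_next_qtd = qtd_phys & ~0x1Fu;   /* 32-byte aligned, T bit clear */
    atomic_thread_fence(memory_order_seq_cst);
}
```

Whether the memory also has to be mapped uncacheable depends on the platform and is not something this sketch settles.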

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Thu Dec 07, 2017 12:07 pm
by Brendan
Korona wrote:Parity errors are PCI errors. In theory, no USB transaction should ever generate parity errors, regardless of the devices that are attached to the USB (and in practice I assume this to be true until I'm proven wrong). Another thing that came to my mind: Do you configure any PCI devices other than EHCI? Do you maybe configure overlapping BARs or do you remap the BAR to some region occupied by system memory or by a legacy device? Are you sure that the region you DMA from/to is regular RAM? Did you check the BIOS memory map to verify that?
I don't think it's a parity error at all.

According to Intel's I/O Controller Hub 7 Family Datasheet, bit 7 of IO port 0x61 is used to indicate that the NMI was caused by PCI's SERR# signal.

For EHCI, the datasheet says:
5.20.6.1 Aborts on USB 2.0-Initiated Memory Reads

If a read initiated by the EHC is aborted, the EHC treats it as a fatal host error. The following actions are taken when this occurs:
  • The Host System Error status bit is set
  • The DMA engines are halted after completing up to one more transaction on the USB interface
  • If enabled (by the Host System Error Enable), then an interrupt is generated
  • If the status is Master Abort, then the Received Master Abort bit in configuration space is set
  • If the status is Target Abort, then the Received Target Abort bit in configuration space is set
  • If enabled (by the SERR Enable bit in the function’s configuration space), then the Signaled System Error bit in configuration space is set.
Mostly, I suspect that EHCI may have been configured to read something from a dodgy address, gets an abort when it tries to read from that dodgy address, responds to the abort by asserting SERR#, and that triggers the NMI.
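
The chain described above (port 0x61, then the EHCI status register, then the PCI status register of the EHCI function) can be sketched as a small classifier. The bit positions are assumed from the ICH7 datasheet, the PCI specification and the EHCI 1.0 specification; the enum and function names are hypothetical, and reading the three registers is left to the caller:

```c
#include <stdint.h>

#define NMI_SC_SERR_STS   (1u << 7)    /* port 0x61: NMI caused by SERR#     */
#define EHCI_USBSTS_HSE   (1u << 4)    /* USBSTS: Host System Error          */
#define PCI_STATUS_RTA    (1u << 12)   /* PCI status: Received Target Abort  */
#define PCI_STATUS_RMA    (1u << 13)   /* PCI status: Received Master Abort  */
#define PCI_STATUS_DPE    (1u << 15)   /* PCI status: Detected Parity Error  */

typedef enum {
    NMI_CAUSE_UNKNOWN,        /* SERR# is not flagged as the NMI source */
    NMI_CAUSE_MASTER_ABORT,   /* EHCI read hit an address nobody claims */
    NMI_CAUSE_TARGET_ABORT,   /* the target refused the EHCI's access   */
    NMI_CAUSE_PARITY,         /* a genuine parity error was detected    */
    NMI_CAUSE_OTHER_SERR      /* SERR#, but none of the above           */
} nmi_cause_t;

static nmi_cause_t classify_nmi(uint8_t port61, uint16_t pci_status,
                                uint32_t usbsts)
{
    if (!(port61 & NMI_SC_SERR_STS))
        return NMI_CAUSE_UNKNOWN;
    if (usbsts & EHCI_USBSTS_HSE) {
        if (pci_status & PCI_STATUS_RMA)
            return NMI_CAUSE_MASTER_ABORT;
        if (pci_status & PCI_STATUS_RTA)
            return NMI_CAUSE_TARGET_ABORT;
    }
    if (pci_status & PCI_STATUS_DPE)
        return NMI_CAUSE_PARITY;
    return NMI_CAUSE_OTHER_SERR;
}
```

Calling this from the NMI handler with freshly read register values would show whether the "parity check" is really an abort on an EHCI-initiated read.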


Cheers,

Brendan

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Fri Dec 08, 2017 6:07 am
by filkra
I think I found out why I get NMIs / Master Aborts from the ThinkPad's EHCI, after reading http://forum.osdev.org/viewtopic.php?t=20255&p=158920.
madeofstaples wrote:Aha! the problem I was having had to do with the EHCI's 64-bit addressing capability. Mine has this capability, but I overlooked that fact knowing the machine running this code only has a 32-bit processor (intel atom). Still, an EHCI controller with 64-bit addressing capability expects the extended data structures in Appendix B.
I just looked at my configuration and my EHCI also has the 64-bit addressing capability flag set... I will try it soon and keep you updated. Thanks for all your help!

Re: USB GetDeviceDescriptor causes NMI (Parity Check)

Posted: Fri Dec 08, 2017 6:49 am
by BenLunt
filkra wrote:I think I found out why I get NMIs / Master Aborts from the ThinkPad's EHCI, after reading http://forum.osdev.org/viewtopic.php?t=20255&p=158920.
madeofstaples wrote:Aha! the problem I was having had to do with the EHCI's 64-bit addressing capability. Mine has this capability, but I overlooked that fact knowing the machine running this code only has a 32-bit processor (intel atom). Still, an EHCI controller with 64-bit addressing capability expects the extended data structures in Appendix B.
I just looked at my configuration and my EHCI also has the 64-bit addressing capability flag set... I will try it soon and keep you updated. Thanks for all your help!
Yes, if the controller reports that it is 64-bit capable, you have to write zeros to the high dwords of the addresses. If you wish, you only need to do this once.
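
A sketch of what the extended data structures look like, assuming the Appendix B layout of the EHCI 1.0 specification (five extra high dwords after the five buffer pointers). The type and function names are hypothetical; the structures still need 32-byte alignment, which the allocator has to provide:

```c
#include <stdint.h>

#define EHCI_HCCPARAMS_64BIT (1u << 0)   /* 64-bit Addressing Capability */

/* qTD as seen by a 64-bit capable controller: 52 bytes instead of 32. */
typedef struct {
    uint32_t next_qtd;
    uint32_t alt_next_qtd;
    uint32_t token;
    uint32_t buffer[5];      /* low 32 bits of each buffer page   */
    uint32_t buffer_hi[5];   /* high 32 bits, zero on a 32-bit OS */
} ehci_qtd64_t;

/* Queue head with the extended overlay: 68 bytes instead of 48. */
typedef struct {
    uint32_t horizontal_link;
    uint32_t ep_characteristics;
    uint32_t ep_capabilities;
    uint32_t current_qtd;
    ehci_qtd64_t overlay;
} ehci_qh64_t;

_Static_assert(sizeof(ehci_qtd64_t) == 52, "unexpected qTD size");
_Static_assert(sizeof(ehci_qh64_t) == 68, "unexpected QH size");

static int ehci_is_64bit_capable(uint32_t hccparams)
{
    return (hccparams & EHCI_HCCPARAMS_64BIT) != 0;
}

/* Zero the high dwords so the controller never fetches above 4 GiB. */
static void ehci_qtd64_clear_high(ehci_qtd64_t *qtd)
{
    for (int i = 0; i < 5; i++)
        qtd->buffer_hi[i] = 0;
}
```

If structures laid out for the 32-bit format are packed back to back, a 64-bit capable controller would read whatever follows them as the high dwords, which fits the aborted reads seen in this thread.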

If this fixes your issue, plug the finger-print reader back in and see if you can still read the descriptor (of the thumb drive). I bet so.

Ben