USB Ehci transaction error
During the development of my EHCI controller driver I've hit a wall, so I'm hoping one of you could help me out with this problem. I'm using VirtualBox to debug my EHCI driver and test it with different USB devices. The problem is that every time I queue up a transfer (a SetAddress request: a Queue Head with two qTDs, one SETUP and one IN), it seems to crash VirtualBox. I've taken some screenshots of the progress it makes before crashing:
The blue region contains the state of the QH and the two qTDs before linking them into the async schedule; the red region contains the state of the transfer right before the crash. The overlay region is the set of members prefixed by '.'
It seems to start processing the first QTD, and then encounters an XActError, then VirtualBox crashes with this:
The port reset logic is located at
https://github.com/Fadekraft/MollenOS/b ... hci/port.c
I've really tried to get this working over the past few days. I've compared my QH/qTD layout with what other OSes (including Linux) produce, and it seems consistent. I've also tried tweaking my port-reset logic, and I currently wait 100 ms after the reset is done before sending the SetAddress request, since I've found that some recovery time is required. Nothing stops VirtualBox from crashing.
Re: USB Ehci transaction error
- Does vbox.log print any details about the reason it marks xacterr?
- vbox seems to be running on Windows. Does generating its crash dump and opening it in WinDbg show any info?
- Going by the output in your post, UsbReserveAddress should probably compute the address as itr*32 + jtr. This is likely unrelated to the vbox crash.
- Lack of access to the source of vbox's EHCI implementation makes debugging difficult. It may be worthwhile to make minor changes (for instance, change the address in the SetAddress command, change the command altogether, etc.) to narrow down why the crash and the xacterr occur.
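For the itr*32 + jtr suggestion above, here is a minimal sketch of a bitmap-based address allocator. The real UsbReserveAddress is not shown in this thread, so the type and function names below (usb_address_map_t, usb_reserve_address, usb_release_address) are hypothetical and only illustrate the arithmetic: itr is the index of the 32-bit word and jtr the bit inside it.

```c
#include <stdint.h>

#define USB_MAX_ADDRESS 127   /* USB device addresses are 7 bits wide */

/* Hypothetical allocator state: 4 words x 32 bits = 128 address slots. */
typedef struct {
    uint32_t map[4];
} usb_address_map_t;

/* Returns a free address in 1..127 and marks it used, or -1 if none is left. */
static int usb_reserve_address(usb_address_map_t *m)
{
    for (int itr = 0; itr < 4; itr++) {
        for (int jtr = 0; jtr < 32; jtr++) {
            int address = itr * 32 + jtr;   /* word index * 32 + bit index */
            if (address == 0)
                continue;                   /* 0 is the default address of an unaddressed device */
            if (!(m->map[itr] & (UINT32_C(1) << jtr))) {
                m->map[itr] |= UINT32_C(1) << jtr;
                return address;
            }
        }
    }
    return -1;
}

/* Marks an address as free again; out-of-range values are ignored. */
static void usb_release_address(usb_address_map_t *m, int address)
{
    if (address <= 0 || address > USB_MAX_ADDRESS)
        return;
    m->map[address / 32] &= ~(UINT32_C(1) << (address % 32));
}
```

With this layout the first address handed out is 1, and the first address taken from the second word is 32.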
Re: USB Ehci transaction error
I have to believe there is something wrong with my EHCI implementation, since if I connect a low-speed or full-speed device it is correctly routed to OHCI and everything enumerates correctly. The crash only occurs when connecting high-speed devices.
And yes, it's really hard to debug - the log does not contain any information related to this except for loaded modules, stack information and a call stack that only involves byte offsets into ntdll and VBoxVmm.dll. I'll try to look at the crash dumps.
Re: USB Ehci transaction error
Shouldn't xacterr also set the halted bit (and clear the active bit)? I am asking based on QEMU's behaviour, which sets the halted bit whenever it sets xacterr.
Edit: Scratch that. I think that is because the qtd is still being processed (the latest status is still only in the overlay).
Re: USB Ehci transaction error
The red output, which is under USB_REASON_DUMP, is preceded by USB_REASON_LINK.
Between USB_REASON_LINK and the red output under USB_REASON_DUMP, there's a (thrd_sleepex) delay of 1 second.
If the delay is increased severalfold, is the crash also delayed by a corresponding amount of time?
Moreover, the values xacterr=1, active=1, halted=0 within the overlay suggest that vbox was very likely in the middle of updating the status.
Since the red output is not printed in response to an interrupt, but after an artificial and fixed (but sufficient) delay, we may have caught vbox in the middle of its update. This does narrow down the area that needs to be investigated within vbox's host controller. If only the source were available.
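To make the xacterr=1, active=1, halted=0 reading concrete, here is a small sketch that decodes the status byte of a qTD token. The bit positions follow the EHCI specification as I read it; the macro and helper names are made up for illustration and are not taken from the driver in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bits in the low byte of the qTD token dword (EHCI specification). */
#define QTD_STATUS_PING_ERR    (1u << 0)  /* ping state / ERR */
#define QTD_STATUS_SPLIT_STATE (1u << 1)  /* split transaction state */
#define QTD_STATUS_MISSED_UF   (1u << 2)  /* missed micro-frame */
#define QTD_STATUS_XACT_ERR    (1u << 3)  /* transaction error */
#define QTD_STATUS_BABBLE      (1u << 4)  /* babble detected */
#define QTD_STATUS_BUFFER_ERR  (1u << 5)  /* data buffer error */
#define QTD_STATUS_HALTED      (1u << 6)  /* queue halted on this qTD */
#define QTD_STATUS_ACTIVE      (1u << 7)  /* controller still owns this qTD */

/* True once the controller has given the qTD back, successfully or not. */
static bool qtd_is_retired(uint8_t status)
{
    return (status & QTD_STATUS_ACTIVE) == 0;
}

/* True if an error has been recorded but the qTD has not been halted yet. */
static bool qtd_error_pending(uint8_t status)
{
    return (status & QTD_STATUS_ACTIVE) != 0 &&
           (status & QTD_STATUS_XACT_ERR) != 0 &&
           (status & QTD_STATUS_HALTED) == 0;
}
```

A status byte of 0x88 (bits 7 and 3 set) matches the state described above: qtd_error_pending(0x88) is true and qtd_is_retired(0x88) is false. A driver that polls instead of waiting for the interrupt would normally treat such a qTD as still in flight.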
Re: USB Ehci transaction error
Very keen observation from inspecting my code. I can tell you that extending the delay does not delay the crash; sometimes it even crashes before the 1 second delay has elapsed. The run shown in this thread did not crash before the 1 second mark, but very shortly after. And from that observation it looks exactly like VirtualBox crashed while updating the qTD in progress, since the Active bit is still set.
It's deeply annoying that I have no source code - does everything else look somewhat correct? The setup etc.? I've tried a lot of different things to get this working, but I'm completely at a loss.
Re: USB Ehci transaction error
The linking of the structures looks okay; vbox was able to fill in the overlay. I am not yet familiar with the data toggle bits, but they might not be involved in the crash.
The set_address request is not the first control transfer to the device, is it?
Some articles I followed in the past mentioned that commercial systems first issue get_device_descriptor to learn the max_pkt_size and only then issue the set_address command. If your implementation does the same, then there is at least one request whose setup transfer completed successfully.
Suppose that there's a reason for vbox to raise the xacterr. Could it be that the setup buffer (which I think is not printed in the output) has problems?
Edit: The qtd has cerr set to 3. Does that mean vbox is obligated to retry the (failed) transaction until the count reaches 0? If we set it to 0, does the crash go away (hinting that vbox has trouble retrying the failed transaction)? Or maybe it simply loops forever retrying the transaction?
Re: USB Ehci transaction error
Interesting question about setting cerr to 0 - I will definitely try that and report the result.
The SetAddress is actually the first request sent to the device. I know that Windows sends get device descriptor first, but with the devices I've enumerated on both OHCI and UHCI this has not been a problem. Could this really provoke a crash?
Re: USB Ehci transaction error
I can't comment on the behaviour of hardware devices, but emulators do cut corners. Lack of source is again a hindrance here. Hopefully vbox's internal state does not depend on the request sequence used by the guest systems it supports.
If set_address is the first command, maybe we can try sending a simpler get_dev_desc first and see whether that succeeds.
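For reference, the two setup packets being compared here could be built roughly as follows. The field layout and request codes are the standard ones from chapter 9 of the USB 2.0 specification; the struct and helper names are hypothetical and not taken from the driver under discussion. The sketch assumes a little-endian host such as x86, so the 16-bit fields can be stored directly.

```c
#include <stdint.h>

/* Standard 8-byte USB setup packet. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_packet_t;

_Static_assert(sizeof(usb_setup_packet_t) == 8, "setup packet must be 8 bytes");

#define USB_REQ_SET_ADDRESS    0x05
#define USB_REQ_GET_DESCRIPTOR 0x06
#define USB_DESC_TYPE_DEVICE   0x01

/* GET_DESCRIPTOR(Device): asking for only 8 bytes is enough to read bMaxPacketSize0. */
static usb_setup_packet_t usb_make_get_device_descriptor(uint16_t length)
{
    usb_setup_packet_t p = {
        .bmRequestType = 0x80,                               /* device-to-host, standard, device */
        .bRequest      = USB_REQ_GET_DESCRIPTOR,
        .wValue        = (uint16_t)(USB_DESC_TYPE_DEVICE << 8), /* type in high byte, index 0 */
        .wIndex        = 0,
        .wLength       = length,
    };
    return p;
}

/* SET_ADDRESS: no data stage, the new address travels in wValue. */
static usb_setup_packet_t usb_make_set_address(uint8_t address)
{
    usb_setup_packet_t p = {
        .bmRequestType = 0x00,                               /* host-to-device, standard, device */
        .bRequest      = USB_REQ_SET_ADDRESS,
        .wValue        = (uint16_t)(address & 0x7F),         /* addresses are 7 bits */
        .wIndex        = 0,
        .wLength       = 0,
    };
    return p;
}
```

Note that the descriptor read needs three qTDs (SETUP, an IN data stage, and an OUT status stage), while SetAddress needs only the SETUP and the IN status stage.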
Re: USB Ehci transaction error
OK, so I tried with CERR set to 0 - the result is exactly the same: Status is 0x88 and Token is now 0x2 instead of 0xe (the difference being the CErr bits). VirtualBox still crashes and the transaction is still stuck at the first qTD, the one that is copied into the overlay area.
As a side note: I don't believe the USB packet containing the SetAddress is the problem, as it works with VirtualBox if I just use the OHCI controller instead of EHCI. At first I suspected my reset logic of being wrong, but after inspecting both the statuses and the delay after reset, everything looks fine. I can try to send the get device descriptor before set address.
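The 0xe versus 0x2 values can be checked with a little arithmetic. This sketch assumes the driver splits the token dword into bytes, with "Token" being the second byte (bits 15:8): PID code in bits 1:0, CErr in bits 3:2, C_Page in bits 6:4 and IOC in bit 7, per the EHCI specification. The helper names are invented for the example.

```c
#include <stdint.h>

/* PID codes used in the qTD token. */
#define QTD_PID_OUT   0u
#define QTD_PID_IN    1u
#define QTD_PID_SETUP 2u

/* Packs the second byte of the qTD token dword (bits 15:8 of the dword). */
static uint8_t qtd_token_byte(unsigned pid, unsigned cerr, unsigned c_page, unsigned ioc)
{
    return (uint8_t)((pid & 3u) |
                     ((cerr & 3u) << 2) |
                     ((c_page & 7u) << 4) |
                     ((ioc & 1u) << 7));
}

/* Extracts the PID code from the packed byte. */
static unsigned qtd_token_pid(uint8_t token_byte)
{
    return token_byte & 3u;
}

/* Extracts the error counter (CErr) from the packed byte. */
static unsigned qtd_token_cerr(uint8_t token_byte)
{
    return (token_byte >> 2) & 3u;
}
```

A SETUP qTD with CErr = 3 packs to 0x0E, and the same qTD with CErr = 0 packs to 0x02, which matches the two values reported above. So the only thing that changed between the runs is the retry counter, not the PID.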
Re: USB Ehci transaction error
Did you check the 64-bit flag? Does your code handle 64-bit addresses? It looks to me that your Queue heads are only 80 bytes in size, when a 64-bit controller will use a few more bytes. Can it be that you are thinking 32-bit addresses, yet the controller is doing 64-bit?
Just a thought.
Ben
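For anyone following along, checking the 64-bit flag could look something like the sketch below. It assumes the capability and operational register blocks are already mapped; the register offsets (HCCPARAMS at capability offset 0x08, CTRLDSSEGMENT at operational offset 0x10) and the meaning of HCCPARAMS bit 0 come from the EHCI specification, while the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EHCI_CAP_HCCPARAMS    0x08        /* capability register offset */
#define EHCI_HCCPARAMS_64BIT  (1u << 0)   /* 64-bit addressing capability */
#define EHCI_OP_CTRLDSSEGMENT 0x10        /* operational register offset */

/* Reads a 32-bit register from a mapped MMIO block. */
static uint32_t mmio_read32(const volatile void *base, uint32_t offset)
{
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/* Writes a 32-bit register in a mapped MMIO block. */
static void mmio_write32(volatile void *base, uint32_t offset, uint32_t value)
{
    *(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
}

/*
 * Returns true if the controller uses the 64-bit data structure layout.
 * In that case the upper 32 address bits are forced to zero so that all
 * schedule structures are assumed to live below 4 GiB.
 */
static bool ehci_setup_segment(const volatile void *cap_base, volatile void *op_base)
{
    uint32_t hccparams = mmio_read32(cap_base, EHCI_CAP_HCCPARAMS);
    if ((hccparams & EHCI_HCCPARAMS_64BIT) == 0)
        return false;
    mmio_write32(op_base, EHCI_OP_CTRLDSSEGMENT, 0);
    return true;
}
```

When this returns true, every qTD and every QH overlay carries five extra "extended buffer pointer" dwords that the controller will read, so those fields need to be present and zeroed even in a 32-bit build.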
Re: USB Ehci transaction error
BenLunt wrote: Did you check the 64-bit flag?
You might have hit the nail on the head!
Re: USB Ehci transaction error
Yes, VirtualBox uses 64-bit EHCI - but I don't see where I am missing anything. I do account for extended buffers in my queue head via the QueueHeadOverlay structure.
Re: USB Ehci transaction error
MollenOS wrote: I do account for extended buffers in my queue head by inspecting the QueueHeadOverlay structure?
The partial encapsulation of the extended buffers within the __BITS == 64 got me (erroneously) thinking that the qh and qtd sizes were reduced when compiled with __BITS == 32.
Re: USB Ehci transaction error
Ahh I see - but as you said, the structure members will always be present even if not compiled as 64-bit. I tried extending the delay from 1 second to 3 seconds and ran it a few times until the crash came after the delay. Nothing changes: it's stuck on the first qTD with the Active + XActErr bits set. It seems to crash while processing the first transaction. What could this indicate? The obvious suspect would be wrong buffer pointers, but I've double-checked them, and they are valid.
BenLunt: what did you mean by my QH being 80 bytes total? I did the count and it's 68 bytes without metadata, 86 with metadata. And comparing with the EHCI spec, my structures are correct even for 64-bit.
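The 68-byte figure can be double-checked at compile time. The layout below is a sketch of the 64-bit data structures as described in the EHCI specification, not the actual MollenOS definitions; the type and field names are illustrative only, and driver metadata would have to come after the hardware-visible part.

```c
#include <stddef.h>
#include <stdint.h>

/* qTD in the 64-bit layout: 8 dwords plus 5 extended buffer pointer dwords. */
typedef struct {
    uint32_t next;          /* next qTD pointer */
    uint32_t alt_next;      /* alternate next qTD pointer */
    uint32_t token;         /* status, PID, CErr, C_Page, IOC, total bytes, toggle */
    uint32_t buffer[5];     /* buffer pointers, low 32 bits */
    uint32_t buffer_hi[5];  /* buffer pointers, upper 32 bits */
} ehci_qtd64_t;

/* QH in the 64-bit layout: 4 header dwords followed by the transfer overlay. */
typedef struct {
    uint32_t horizontal_link;
    uint32_t ep_characteristics;
    uint32_t ep_capabilities;
    uint32_t current_qtd;
    ehci_qtd64_t overlay;
} ehci_qh64_t;

/* Compile-time checks of the hardware-visible sizes and the overlay offset. */
_Static_assert(sizeof(ehci_qtd64_t) == 52, "64-bit qTD is 13 dwords");
_Static_assert(sizeof(ehci_qh64_t) == 68, "64-bit QH is 17 dwords");
_Static_assert(offsetof(ehci_qh64_t, overlay) == 16, "overlay starts after 4 dwords");
```

Under these assumptions the hardware-visible QH is 68 bytes (48 bytes in the 32-bit layout plus 20 bytes of extended buffer pointers), which agrees with the count given above.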