Page 6 of 8
Re: USB on real hw
Posted: Wed Aug 30, 2023 12:43 pm
by Klakap
Well, so it is either wrong initialization of devices before transfer, or your SETUP packet sends some wrong values. We need to get more informations. What value is in port register right before you send SETUP packet? And just to be sure, can you debug what are you sending through SETUP packet on real hardware?
Re: USB on real hw
Posted: Wed Aug 30, 2023 3:41 pm
by BenLunt
Your reset_port() call uses the port_set and port_clear defines, which in turn read in the port value, then write the port value back, either with an extra bit or minus an extra bit.
However, by doing so, at
Line 163, you are indeed clearing the Port Reset, but writing the CSC at the same time. This is the error I have been explaining. You should clear the reset by writing a zero to bit 9, but *do not* write a 1 to bit 1 at that time. Wait a few uS, then clear the CSC bit by writing a 1 to it.
The process of clearing the reset (clearing bit 9) *and* clearing the CSC (writing a 1 to bit 1) with the same write is what makes the port not become available. You *must* clear the reset (clearing bit 9) first, then with the port not in reset, clear the CSC (writing a 1 to bit 1) with another write.
I see that my example at
Line 283 does indeed have this error as well. I was unaware of this. I kept looking at my actual kernel driver code (which is local to my machine), not the example I have at the URL you mention. My apologies. Something I will fix ASAP.
Update: I updated the code starting at
Line 278. I didn't compile it, but a quick glance over looks good. Let me know what you find out.
Ben
Re: USB on real hw
Posted: Thu Aug 31, 2023 3:03 am
by Bonfra
It works!! now all of the initialization phase and device discovery works excellently! Thanks a lot :)
Well, I'm getting some randos page faults and general protection faults later on the msd initialization but I'm pretty sure is caused by some quirkiness in my code flow that happens to work in emulation but shouldn't really, so nothing related to the USB stack.
Well that's awesome news, all TDs are working fine and I can read all the descriptors I want.
Thanks again, I'll post some updates if those GPFs and PFs turn out to be caused by some data returned by the controller in some way.
Re: USB on real hw
Posted: Thu Aug 31, 2023 11:33 am
by Bonfra
Well actually one more thing before I proceed on fixing my bogus MSD implementation...
UHCI specs page 20 wrote:Frame Lists are aligned on 4-Kbyte boundaries. Transfer Descriptors and Queue Heads must be aligned on 16-byte
boundaries
In fact,
on line 207 I allocate TDs with alignas(0x10) and it works great.
Later, on the same function, where I
allocate the queue head at line 238 if the queue head is just aligned to 0x10 it doesn't work, it gets stuck (on real hw). If instead it's aligned to 0x1000 it works fine. Is there any specific reason for this or is it just some random quirk that I should respect?
Re: USB on real hw
Posted: Thu Aug 31, 2023 2:59 pm
by thewrongchristian
Bonfra wrote:Well actually one more thing before I proceed on fixing my bogus MSD implementation...
UHCI specs page 20 wrote:Frame Lists are aligned on 4-Kbyte boundaries. Transfer Descriptors and Queue Heads must be aligned on 16-byte
boundaries
In fact,
on line 207 I allocate TDs with alignas(0x10) and it works great.
Later, on the same function, where I
allocate the queue head at line 238 if the queue head is just aligned to 0x10 it doesn't work, it gets stuck (on real hw). If instead it's aligned to 0x1000 it works fine. Is there any specific reason for this or is it just some random quirk that I should respect?
"Allocate" seems a bit optimistic. You allocate it on the stack as an automatic variable.
I'm assuming you have 1:1 linear->physical page mapping as well, as you appear to be using the linear stack address as is.
You write this QH address straight to the frame list schedule, in each frame. So you're using a shotgun approach to ensure your QH is executed.
But, when your function is done, it makes no attempt to reset the frame list schedule to remove the presumably now processed QH, instead it is left on the schedule, and the UHCI controller will continue to examine it.
If you put it on the stack, what will be happening is that your stack location for the QH will be overwritten with other function values, which is probably messing up your controller as it'll be reading garbage from the QH.
I'd recommend:
1. Allocating both the QH and TD using non-automatic stack based storage.
2. Once your QH is processed, remove it from the schedule.
3. Design a better schedule, that points to static queue heads, onto which you can attach your temporary QH. If you have Ben's USB book (which I highly recommend), it will have the details of how to lay out the schedule.
Re: USB on real hw
Posted: Thu Aug 31, 2023 4:25 pm
by Bonfra
thewrongchristian wrote:
But, when your function is done, it makes no attempt to reset the frame list schedule to remove the presumably now processed QH, instead it is left on the schedule, and the UHCI controller will continue to examine it.
If you put it on the stack, what will be happening is that your stack location for the QH will be overwritten with other function values, which is probably messing up your controller as it'll be reading garbage from the QH.
Yup, that was the case, clearing the framelist before returning ensures it doesn't read some weird corrupted data.
This also explains why aligning to 0x1000 works since it possibly puts that memory way forward into the stack and it doesn't have the opportunity to override it.
Yes, I should switch to a better schedule like the one that Ben described both here in a previous post and in the book but for the moment I'm focussing on the goal of "this thing should work, performance later".
Anyway thanks for pointing out that mistake :) I'm getting back to work on that GPFs/PFs wich stalls the MSD code. Still convinced is some null pointer reference with corrupted memory
Re: USB on real hw
Posted: Fri Sep 01, 2023 1:27 am
by Bonfra
Alright, now every fault has been fixed and I'm back to fighting with USB.
Indeed it was bad memory management and things like that.
Now I think I'm having some problems with the bulk requests, this or some errors with SCSI.
RN it get stuck during
the second bulk in request so it should be
the one receiving the cvw.
Re: USB on real hw
Posted: Mon Sep 04, 2023 1:04 pm
by mid
Bulk transfers did not work for me when polling for USBINT. Instead I have it check each TD completion individually.
Re: USB on real hw
Posted: Mon Sep 04, 2023 4:14 pm
by BenLunt
Bonfra wrote:Alright, now every fault has been fixed and I'm back to fighting with USB.
Indeed it was bad memory management and things like that.
Now I think I'm having some problems with the bulk requests, this or some errors with SCSI.
RN it get stuck during
the second bulk in request so it should be
the one receiving the cvw.
Just a note, I fixed an issue in Bochs' UHCI code when a transfer request was more than 1280 bytes, which could have easily created memory errors in the guest, which showed up as SCSI errors. Possibly exactly what you are describing. It is
pull request #71.
Ben
-
https://www.fysnet.net/the_universal_serial_bus.htm
Re: USB on real hw
Posted: Fri Oct 06, 2023 12:57 pm
by Bonfra
I'm sorry university and work took the upper hand again...
So where did I leave off? Ah yes, nothing was working :) Nah kidding it's awesome that the control protocol works on real HW and I can print device information (like the various strings).
mid wrote:Bulk transfers did not work for me when polling for USBINT. Instead, I have it check each TD completion individually.
Do you mean that for bulk you spinned on each TD's STATUS_ACTIVE bit? Just tried and it doesn't seem to be working, at least the loop doesn't get stuck waiting for USBINT anymore but the CSW struct I pass to the function that got stuck before is left untouched. I also tried adding some random 1s delay but it still is giving the same none result.
BenLunt wrote:
Just a note, I fixed an issue in Bochs' UHCI code when a transfer request was more than 1280 bytes, which could have easily created memory errors in the guest, which showed up as SCSI errors. Possibly exactly what you are describing. It is
pull request #71.
I'm not sure if you are suggesting that my code implements the same mistake that BOCHS did or if my problem could be caused by a bug in BOCHS. If the latter is the case I'm anyway using QEMU (and the code works here) and real HW (the code doesn't work here). If instead you are saying that I'm doing something wrong (actually surely what is happening XD) I'm not sure what you are talking about.
I tried to debug a bit more in my little spare time but nothing more has emerged since last time, I'll update the post if something else pops out. Sorry again for my long absence.
Re: USB on real hw
Posted: Fri Oct 06, 2023 5:55 pm
by BenLunt
Bonfra wrote:I tried to debug a bit more in my little spare time but nothing more has emerged since last time, I'll update the post if something else pops out. Sorry again for my long absence.
Do you have a current image file I can run in Bochs?
- Ben
Re: USB on real hw
Posted: Sat Oct 07, 2023 2:17 am
by Bonfra
BenLunt wrote:
Do you have a current image file I can run in Bochs?
Sure,
here it is. Should match with the last commit.
Re: USB on real hw
Posted: Sun Oct 08, 2023 1:36 pm
by BenLunt
When I run your image in Bochs, the serial1.txt file shows:
src/interrupts/exceptions.c:84 #DE: Divide by zero exception
Somewhere you are dividing by zero, meaning an item probably isn't getting initialized as you think it should be.
This is done before any debug information on the MSD is displayed so fairly soon in your driver initialization.
Just after you reset the port, just before your reset it again, so somewhere between 1 and 5 below:
1) reset the port
2) clear reset
3) clear change bit(s)
4) get 8 bytes of descriptor
5) reset port again
Ben
Re: USB on real hw
Posted: Sun Oct 08, 2023 3:00 pm
by Bonfra
BenLunt wrote:When I run your image in Bochs, the serial1.txt file shows:
src/interrupts/exceptions.c:84 #DE: Divide by zero exception
Somewhere you are dividing by zero, meaning an item probably isn't getting initialized as you think it should be.
This is done before any debug information on the MSD is displayed so fairly soon in your driver initialization.
Just after you reset the port, just before your reset it again, so somewhere between 1 and 5 below:
1) reset the port
2) clear reset
3) clear change bit(s)
4) get 8 bytes of descriptor
5) reset port again
Ben
Well, this is a bit weird I thought I fixed that problem.
I'm gonna guess it's again somewhere in the calculation of the packet size.
Whenever I initialize a transfer I calculate the number of packets based on the size that the device supports, so maybe while I'm getting the first 8 bytes of the descriptor the size turned out to be zero and killed the execution:
Code: Select all
size_t pksiz = packet_size(bus, addr, endpoint);
size_t num_packets = size / pksiz + (size % pksiz != 0);
(in any function of
this file).
That packet size function relies on the device being already registered in the system. Since it is still resetting and nothing is set in place it falls back to calculating the packet size based on the status of the port: if its a full speed device it goes for 64, 8 for low speed and 0 if nothing is attached. This makes the driver go crazy if something bad happens but if something is attached to uhci it must be in either one of these states.
(
relevant code here)
Anyway, I'm facing this issue in QEMU or in real hw. Maybe you set BOCHS in some way that I was not expecting when I thought about this temporary packet size calculation.
I added more logging to that part of the code; just pushed the image.
That (if I recall correctly) is the only division that happens in that piece of code so this should help be 100% sure it's that
Re: USB on real hw
Posted: Tue Oct 10, 2023 3:59 pm
by Bonfra
I cannot replicate your error in any testing environment. If Bochs settings don't allow changing the speed of the device I'm gonna add a quick bypass to the code that assumes 8 as the supported size if the descriptor hasn't been read yet.
Let me know if you prefer me to apply this bypass directly or if you want to tinker with Bochs settings a bit.
Anyway I'm still assuming that this is the issue. If based on the logs of the newly pushed image it appears to be something different please point me in the right direction :)