Page 4 of 8

Re: USB on real hw

Posted: Sun Aug 20, 2023 5:19 am
by Klakap
Bonfra wrote:

Code: Select all

second one:
  - type = USB_PACKET_TYPE_IN
  - maxlen = 0x800;
  - buffer = NULL;
  - toggle = 1;
I think problem may be in maxlen 0x800. In STATUS stage you have to transfer zero bytes. So, when you take in account that in UHCI packets you calculate maxlen by (transferred length-1), and maxlen have 11 bits, you should have value 0x7FF.

Re: USB on real hw

Posted: Sun Aug 20, 2023 2:56 pm
by BenLunt
Klakap wrote:
Bonfra wrote:

Code: Select all

second one:
  - type = USB_PACKET_TYPE_IN
  - maxlen = 0x800;
  - buffer = NULL;
  - toggle = 1;
I think problem may be in maxlen 0x800. In STATUS stage you have to transfer zero bytes. So, when you take in account that in UHCI packets you calculate maxlen by (transferred length-1), and maxlen have 11 bits, you should have value 0x7FF.
This is correct. Good catch.

Sorry Bonfra, I had not made it back until now to have a look. Looks like Klakap beat me to it. Again, good catch.

Ben

Re: USB on real hw

Posted: Tue Aug 22, 2023 12:23 am
by Bonfra
Klakap wrote: I think problem may be in maxlen 0x800. In STATUS stage you have to transfer zero bytes. So, when you take in account that in UHCI packets you calculate maxlen by (transferred length-1), and maxlen have 11 bits, you should have value 0x7FF.
That 0x800 should automagically become 0x7FF when it's inserted as the maxlen of the TD.
Sorry, I forgot to explain better the part where I convert those USB packets into TDs for UHCI, I'll post the code here.

Code: Select all

for(size_t i = 0; i < num_packets; i++)
    {
        tds[i].link = i == num_packets - 1 ? TD_TERMINATE : ((uint32_t)(uint64_t)&tds[i + 1] | TD_DEPTH_FIRST);
        tds[i].flags = TD_STATUS_ACTIVE | TD_C_ERR;

        if(i == num_packets - 1)
            tds[i].flags |= TD_IOC;

        if(port_status(controller, addr) == USB_PORT_STATUS_CONNECT_LOW_SPEED)
            tds[i].flags |= TD_LOW_SPEED;
        
        tds[i].maxlen = ((packets[i].maxlen - 1)  << 21) | (endpoint << 15) | (addr << 8);
        
        switch (packets[i].type)
        {
        case USB_PACKET_TYPE_SETUP: tds[i].maxlen |= TD_PID_SETUP; break;
        case USB_PACKET_TYPE_IN: tds[i].maxlen |= TD_PID_IN; break;
        case USB_PACKET_TYPE_OUT: tds[i].maxlen |= TD_PID_OUT; break;
        }

        if(packets[i].toggle)
            tds[i].maxlen |= TD_DATA_TOGGLE;

        tds[i].bufptr = (uint32_t)(uint64_t)packets[i].buffer;
    }
Maybe I'm missing something in the calculation or such.. idk
All of the other packets (for example the setup ones) I set them to their full size (8 for the setup) and they get converted to that len -1 and all appears to work fine
BenLunt wrote: Sorry Bonfra, I had not made it back until now to have a look.
Nah don't even say it, I'm the one that disappears for long periods here XD

Re: USB on real hw

Posted: Tue Aug 22, 2023 7:21 am
by Klakap
I think it may be helpful if you would post how all your packets look like in hex values before and after processing. If error is in packets, it should be faster way to recognize it than reading all your code.

Re: USB on real hw

Posted: Tue Aug 22, 2023 9:53 am
by Bonfra
So.. there are a whole lot of things that are happening at the moment...
I have three USB ports on the laptop I'm using for testing, one of them gives a weird behavior where the device is successfully initialized but later the port disconnects for some reason. the other two 90% of the time get stuck on the phase where the device descriptor is requested (the first request between the two port resets), the rest 10% of the time it seems to work fine but every result of a bulk read of the MSD returns all zeros (probably a problem with the MSD driver so I'm counting this a functional occurrence of the USB driver)... I'm a bit cluttered with work but asap I'll post some screenshots with the VGA capture card showing all of the behaviours deeply logged with all of the bytes of the TDs (If I manage to replicate them all XD)

Re: USB on real hw

Posted: Tue Aug 22, 2023 5:48 pm
by BenLunt
Klakap wrote:I think it may be helpful if you would post how all your packets look like in hex values before and after processing. If error is in packets, it should be faster way to recognize it than reading all your code.
I would like to add to this request and ask for a (very) recent image file as well.

Thanks,
Ben

Re: USB on real hw

Posted: Wed Aug 23, 2023 1:21 am
by Bonfra
So here are all the TDs that are sent to the UHCI stack:

In this sample, the device never reacts to the first TDs requesting the device descriptor using the maximum size based on the device speed:
Image

In this sample, the device successfully respond t the first request but after a second reset its doesn't
Image

In this final sample, the device disconnects after it has been detected so checking the port speed again to calculate the packet split returns a zero
Image
BenLunt wrote:I would like to add to this request and ask for a (very) recent image file as well.
Would you like me to compile without VESA so you can test it on bochs or do you prefer the exact image that produced the outputs above?

EDIT
After a lot of reboots without changing a single thing I also got it to this other face:
Image
It get stuck while waiting for the configuration

Re: USB on real hw

Posted: Wed Aug 23, 2023 8:49 am
by Klakap
For me packets look right. Maybe Ben can catch something. But I would also like to ask for how packets looks like after each of those situations. UHCI will report why it did not transfer packet in “flags” field, so it is quite useful value for us. And I would also like to ask how are you filling Frame List.

Re: USB on real hw

Posted: Wed Aug 23, 2023 12:19 pm
by Bonfra
Here is a more verbose log printing the tds after every transaction (or, if they get stuck, after 10000 iterations waiting)

This sample is the one that gets stuck on the first descriptor:
Image

I'm not having any luck reproducing the other errors with this level of logging... I'll update the post as soon as I get some other behavior (other than the useless no device found)
Klakap wrote:And I would also like to ask how are you filling Frame List.
Basically, I'm starting and stopping the schedule for every transaction. before every operation, I clear it all with TD_TERMINATE and put a queue head as the first element pointing to the first TD so they get evaluated breath first.

Re: USB on real hw

Posted: Wed Aug 23, 2023 1:27 pm
by BenLunt
Bonfra wrote:Would you like me to compile without VESA so you can test it on bochs or do you prefer the exact image that produced the outputs above?
An image I can load and run with Bochs, please.
Bonfra wrote:Basically, I'm starting and stopping the schedule for every transaction. before every operation, I clear it all with TD_TERMINATE and put a queue head as the first element pointing to the first TD so they get evaluated breath first.
(Note that Breadth First is horizontal. Only one TD in that Queue will be executed, moving to the next Queue in this frame. Any concurrent TDs in any of these QHs will not be executed until the next frame or if any QH points back to any of these existing QHs. Are your intentions Depth First instead of Breadth First?)

In my opinion, it is not safe practice to start and stop the schedule. I am away from my desk, which means I am away from my notes, but if memory serves, the UHCI expects all memory to be static until the next frame. i.e.: during a frame, all memory that can be accessed by the controller is assumed to have already be written to physical memory. i.e.: you should not write to any Queue or TD that can be accessed by the UHCI during this frame (1ms). Same for reads, you should not assume any read is valid until the end-of-frame interrupt has been triggered.

Any read or write via an emulator will be instant. i.e.: as soon as the write is encountered, any further read from that memory will read the new value. However, on real hardware, the controller can and probably will read a full page (4k for example) of physical memory, using that cached page until a memory address is outside that 4k, triggering another 4k read. Therefore, any write to that 4k memory after the controller has read it, will not be valid until the next frame when the controller is required to read it again from physical memory.

At end-of-frame, the controller is to write back any memory that it may have modified (status flags, length, etc.), as well as when you stop the controller. What if you have written to a memory position that the controller thinks has been static, in turn writing back its values, over-writing your modification?

Again, this is all from memory, but if it serves me correctly, you should not write to any memory that the controller will access during that particular frame, and you must not read from any address that the controller can access during this frame until the end-of-frame is triggered.

A suggested way to do this is to have multiple periodical schedules within your frame's stack. i.e.: have every odd frame number point to one periodical stack, and every even frame number point to another stack. As for control/bulk TDs, do a similar technique. Every odd frame uses one stream of Control/Bulk Queues/TDs and every even numbered frame uses another stream. Then all you have to do is see which frame number the UHCI is on and use the correct periodical/control/bulk stream. With this technique, you can safely read and write to the "dormant" stream at will, as long as you take less than 1ms to make the modifications *and* none of the streams overlap each other. i.e.: no QH/TD in the odd stream is present in the even stream.

So, once you start the controller, you should not have to stop it until there is an error or a shut-down/sleep is triggered.

Does this make sense? I hope.

Ben
- https://www.fysnet.net/osdesign_book_series.htm

Re: USB on real hw

Posted: Thu Aug 24, 2023 1:49 am
by Bonfra
BenLunt wrote: An image I can load and run with Bochs, please.
Too big for the forum, is a Mediafire download link ok?
BenLunt wrote: Are your intentions Depth First instead of Breadth First?
Sorry, I meant depth-first, my bad.
BenLunt wrote: Does this make sense?
Probably yes but I'm gonna ask for confirmation:
I have two queue heads (let's call them QA and QB), every even index of the framelist points to QA and every odd to QB. Every time I need to push some TDs I read the FRNUM register and if it it odd I replace the TD chain on QA (the even queue), if it's even I instead change QB.
This way every time the frame ends the cache is flushed and the memory is fully reloaded with the new things I just wrote, avoiding unintentional override to some memory that the device didn't think I'd modify.

Re: USB on real hw

Posted: Thu Aug 24, 2023 7:20 am
by Klakap
For debugging purposes you can do it even simpler. Just rewrite all frame list entries to be pointers to your first packet(or queue head), so your transfer will be processed immediately on next frame, no matter of frame number.

Re: USB on real hw

Posted: Thu Aug 24, 2023 7:41 am
by thewrongchristian
BenLunt wrote:
Bonfra wrote:Basically, I'm starting and stopping the schedule for every transaction. before every operation, I clear it all with TD_TERMINATE and put a queue head as the first element pointing to the first TD so they get evaluated breath first.
(Note that Breadth First is horizontal. Only one TD in that Queue will be executed, moving to the next Queue in this frame. Any concurrent TDs in any of these QHs will not be executed until the next frame or if any QH points back to any of these existing QHs. Are your intentions Depth First instead of Breadth First?)

In my opinion, it is not safe practice to start and stop the schedule. I am away from my desk, which means I am away from my notes, but if memory serves, the UHCI expects all memory to be static until the next frame. i.e.: during a frame, all memory that can be accessed by the controller is assumed to have already be written to physical memory. i.e.: you should not write to any Queue or TD that can be accessed by the UHCI during this frame (1ms). Same for reads, you should not assume any read is valid until the end-of-frame interrupt has been triggered.

Any read or write via an emulator will be instant. i.e.: as soon as the write is encountered, any further read from that memory will read the new value. However, on real hardware, the controller can and probably will read a full page (4k for example) of physical memory, using that cached page until a memory address is outside that 4k, triggering another 4k read. Therefore, any write to that 4k memory after the controller has read it, will not be valid until the next frame when the controller is required to read it again from physical memory.

At end-of-frame, the controller is to write back any memory that it may have modified (status flags, length, etc.), as well as when you stop the controller. What if you have written to a memory position that the controller thinks has been static, in turn writing back its values, over-writing your modification?

Again, this is all from memory, but if it serves me correctly, you should not write to any memory that the controller will access during that particular frame, and you must not read from any address that the controller can access during this frame until the end-of-frame is triggered.

A suggested way to do this is to have multiple periodical schedules within your frame's stack. i.e.: have every odd frame number point to one periodical stack, and every even frame number point to another stack. As for control/bulk TDs, do a similar technique. Every odd frame uses one stream of Control/Bulk Queues/TDs and every even numbered frame uses another stream. Then all you have to do is see which frame number the UHCI is on and use the correct periodical/control/bulk stream. With this technique, you can safely read and write to the "dormant" stream at will, as long as you take less than 1ms to make the modifications *and* none of the streams overlap each other. i.e.: no QH/TD in the odd stream is present in the even stream.
That seems like a bit of a faff, as well as a potential waste of peak device bandwidth.

According to the spec, any link fields are read-only to controller, so the controller will not be modifying any link fields., So long as updates to the link fields are atomic, that should suffice for the controller.

In my code, for each transfer, I build offline a QH with however many TD are required to transfer all the data for the request. Being offline, and not linked into the UHCI schedule, this will not be dependent on the when the UHCI reads data based on the schedule.

Say you want to insert the new QH2 AFTER QH1, you fill in QH2.next pointer with the QH1.next, so that both QH1 and the new QH2 point to the same next QH (or null, if there is nothing else on the end of the list.)

Then, you update QH1.next to point to QH2 in a singe write. This write is (should be) atomic, so the UHCI will either get the old value, or the new value, but either way, the chain of QH should be coherent, and being read-only to the UHCI, it will not be updated by the UHCI controller.

Removing a QH is similarly an atomic single write, setting QH1.next from QH2.next (if we're removing QH2). Being a single write, it will be atomic. Thinking about it, though, the controller might be traversing this QH when you remove it, so you should perhaps remove QH2, then wait for the schedule to move to the next frame before reusing the QH. You can do this by putting the removed QH into a separate queue, not linked to the UHCI schedule, and after the end of frame interrupt, you can just walk the list of now removed and surplus QH, releasing the resources safe in the knowledge the controller has finished the frame so definitely won't be referencing them again. I currently don't do this waiting, so I can actually free the resources used by the QH/TD while the controller might be actually traversing them in the next frame, so that's something for me to fix!

Re: USB on real hw

Posted: Thu Aug 24, 2023 8:26 am
by Bonfra
Klakap wrote:For debugging purposes you can do it even simpler. Just rewrite all frame list entries to be pointers to your first packet(or queue head), so your transfer will be processed immediately on next frame, no matter of frame number.
Yeah it seems to be the same pretty much. I'm not targeting any fancy performance, just a proof of concept implementation.
The idea works great in emulation (without starting and stopping the schedule and just by clearing the IOC flag before replacing the whole framelist with the same pointer to the QH) but on real HW I get a different error than before. Now the status has bits 18 and 22 set. So CRC and Stalled. Weird

Re: USB on real hw

Posted: Thu Aug 24, 2023 8:54 am
by Klakap
I am not exactly sure how you edit frame list right now. If you use this idea, you should do it in this way:
1. make all frame list entries invalid (set bit 0)
2. prepare your packets
3. rewrite all frame entries to be pointing to your packet structure
And where on hardware error happens? In SETUP stage/DATA stage/STATUS stage?