USB on real hw

Bonfra · Post by **Bonfra** » Thu Aug 24, 2023 9:18 am

Klakap wrote:I am not exactly sure how you edit frame list right now. If you use this idea, you should do it in this way:
1. make all frame list entries invalid (set bit 0)
2. prepare your packets
3. rewrite all frame entries to be pointing to your packet structure

Yeah, that's pretty much what I'm doing:
1. clear the USBSTS_INT bit
2. set every frame list entry to TD_TERMINATE (bit 0)
3. prepare the queue head with all TDs offline
4. clear the USBSTS_INT bit again
5. cycle through every single element and point it to the new queue head
6. spin on the USBSTS_INT bit for completion (the last TD has the IOC bit set)
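
A minimal sketch of steps 2 and 5 above, assuming a 1024-entry frame list of 32-bit link pointers in which bit 0 is the terminate flag and bit 1 marks the target as a queue head. The function names, `LINK_QH`, and the `qh_phys` parameter are made up for illustration. Clearing and polling USBSTS_INT (steps 1, 4 and 6) needs port I/O and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LIST_ENTRIES 1024u
#define TD_TERMINATE 0x1u /* bit 0: entry is invalid, the controller skips it */
#define LINK_QH      0x2u /* bit 1: entry points to a QH rather than a TD (hypothetical name) */

/* Step 2: mark every frame list entry invalid. */
void frame_list_invalidate(volatile uint32_t *frame_list)
{
    for (size_t i = 0; i < FRAME_LIST_ENTRIES; i++)
        frame_list[i] = TD_TERMINATE;
}

/* Step 5: point every entry at the prepared queue head.
 * qh_phys is assumed to be the 16-byte-aligned physical address of the QH. */
void frame_list_point_to_qh(volatile uint32_t *frame_list, uint32_t qh_phys)
{
    assert((qh_phys & 0xFu) == 0);
    for (size_t i = 0; i < FRAME_LIST_ENTRIES; i++)
        frame_list[i] = qh_phys | LINK_QH;
}
```

Each entry is rewritten with a single aligned 32-bit store, so the controller sees either the terminated entry or the complete new pointer.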

Klakap wrote: And where on hardware error happens? In SETUP stage/DATA stage/STATUS stage?

On the SETUP packet of the very first transaction, the one where I request the device descriptor.

BenLunt · Post by **BenLunt** » Thu Aug 24, 2023 9:24 am

thewrongchristian wrote:That seems like a bit of a faff, as well as a potential waste of peak device bandwidth.

According to the spec, any link fields are read-only to the controller, so the controller will not be modifying any link fields. So long as updates to the link fields are atomic, that should suffice for the controller.

In my code, for each transfer, I build offline a QH with however many TDs are required to transfer all the data for the request. Being offline, and not linked into the UHCI schedule, this will not be dependent on when the UHCI reads data based on the schedule.

Say you want to insert the new QH2 AFTER QH1, you fill in the QH2.next pointer with QH1.next, so that both QH1 and the new QH2 point to the same next QH (or null, if there is nothing else on the end of the list.)

Then, you update QH1.next to point to QH2 in a single write. This write is (should be) atomic, so the UHCI will either get the old value, or the new value, but either way, the chain of QH should be coherent, and being read-only to the UHCI, it will not be updated by the UHCI controller.

Hi,

I don't think you understood what I was trying to express. What if the controller has read QH1 just an instant before "you update QH1.next to point to QH2 in a single write"? The controller will not see the update until the next time it comes around to see QH1 again, if it even does come back to QH1. Therefore, QH2 may possibly *never* get executed. Also, depending on how you construct your schedule, it could be 1023 frame times before it comes back to QH1, nearly a full 1024 ms later.

My point is, and this is my humble opinion, software should not modify any QH (or TD) that can possibly be executed within the current frame. Always modify QHs and TDs that are guaranteed to not be executed until the next frame.

As for loss of peak device bandwidth, this technique loses no bandwidth as long as it is done correctly.

For example, all Frame List Pointers should point to an arbitrary count of ISO TDs, active or not, then a list of periodic QHs depending on the Frame Number value, then an arbitrary list of Bulk QHs, and finally an arbitrary list of Control QHs. The last Control QH should point to the first Bulk QH for Bulk/Control reclamation. Depending on your preference, the Control QHs can come first.

Having two complete lists as described in my previous post does not have any loss in bandwidth since any added QH, when inserted correctly, is guaranteed (errors aside) to be executed in the next frame.

Ben

BenLunt · Post by **BenLunt** » Thu Aug 24, 2023 9:30 am

Bonfra wrote:is a Mediafire download link ok?

Mediafire says the image file doesn't meet requirements.

Bonfra wrote:I have two queue heads (let's call them QA and QB), every even index of the framelist points to QA and every odd to QB. Every time I need to push some TDs I read the FRNUM register and if it is odd I replace the TD chain on QA (the even queue), if it's even I instead change QB.
This way every time the frame ends the cache is flushed and the memory is fully reloaded with the new things I just wrote, avoiding unintentional override to some memory that the device didn't think I'd modify.

Exactly.

Any QH or TD that you push to the list will be guaranteed (errors aside) to be executed in the next frame, less than 1000uS away. Also, with this technique any memory access will be guaranteed to be valid since the controller has not touched the memory before, during, or after the access (for the current frame time).
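
A rough sketch of the QA/QB arrangement quoted above, assuming a 1024-entry frame list whose entries use bit 1 to mark a queue head pointer. The function names and the `qa_phys`/`qb_phys` parameters are hypothetical. Note that reading FRNUM and then acting on it is not atomic with respect to the controller, so this only picks the queue that was idle at the moment of the read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LIST_ENTRIES 1024u
#define LINK_QH 0x2u /* bit 1: entry points to a QH (hypothetical name) */

/* Even frame list indices point to QA, odd indices to QB. */
void frame_list_alternate(volatile uint32_t *frame_list,
                          uint32_t qa_phys, uint32_t qb_phys)
{
    assert((qa_phys & 0xFu) == 0 && (qb_phys & 0xFu) == 0);
    for (size_t i = 0; i < FRAME_LIST_ENTRIES; i++)
        frame_list[i] = ((i & 1u) ? qb_phys : qa_phys) | LINK_QH;
}

/* Given a value read from FRNUM, return which queue the controller is
 * NOT walking in that frame: 0 for QA (current frame is odd),
 * 1 for QB (current frame is even). */
unsigned idle_queue(uint16_t frnum)
{
    return (frnum & 1u) ? 0u : 1u;
}
```

The driver would then replace the TD chain only on the queue selected by `idle_queue()`, leaving the other one untouched until the frame number changes.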

Ben

thewrongchristian · Post by **thewrongchristian** » Thu Aug 24, 2023 5:00 pm

BenLunt wrote:
thewrongchristian wrote:That seems like a bit of a faff, as well as a potential waste of peak device bandwidth.

According to the spec, any link fields are read-only to the controller, so the controller will not be modifying any link fields. So long as updates to the link fields are atomic, that should suffice for the controller.

In my code, for each transfer, I build offline a QH with however many TDs are required to transfer all the data for the request. Being offline, and not linked into the UHCI schedule, this will not be dependent on when the UHCI reads data based on the schedule.

Say you want to insert the new QH2 AFTER QH1, you fill in the QH2.next pointer with QH1.next, so that both QH1 and the new QH2 point to the same next QH (or null, if there is nothing else on the end of the list.)

Then, you update QH1.next to point to QH2 in a single write. This write is (should be) atomic, so the UHCI will either get the old value, or the new value, but either way, the chain of QH should be coherent, and being read-only to the UHCI, it will not be updated by the UHCI controller.
Hi,

I don't think you understood what I was trying to express. What if the controller has read QH1 just an instant before "you update QH1.next to point to QH2 in a single write"? The controller will not see the update until the next time it comes around to see QH1 again, if it even does come back to QH1. Therefore, QH2 may possibly *never* get executed. Also, depending on how you construct your schedule, it could be 1023 frame times before it comes back to QH1, nearly a full 1024 ms later.

My point is, and this is my humble opinion, software should not modify any QH (or TD) that can possibly be executed within the current frame. Always modify QHs and TDs that are guaranteed to not be executed until the next frame.

As for loss of peak device bandwidth, this technique loses no bandwidth as long as it is done correctly.

For example, all Frame List Pointers should point to an arbitrary count of ISO TDs, active or not, then a list of periodic QHs depending on the Frame Number value, then an arbitrary list of Bulk QHs, and finally an arbitrary list of Control QHs. The last Control QH should point to the first Bulk QH for Bulk/Control reclamation. Depending on your preference, the Control QHs can come first.

Having two complete lists as described in my previous post does not have any loss in bandwidth since any added QH, when inserted correctly, is guaranteed (errors aside) to be executed in the next frame.

Ben

So you're advocating a "current" list and a "next" list? So, only insert new QH onto the "next" list, which presumably will flip-flop each frame.

The problem with that is, how do you know what the controller considers the "current" frame? You can read the current frame number being processed from the UHCI FRNUM register. From that, you can derive the current and next frames.

But, reading FRNUM and acting on it is not atomic. You've inherently got a race between the driver and the controller, because by the time you've determined what your "next" frame list is, an interrupt could have happened and the UHCI controller has moved onto the next frame without you realising, so you'd be putting the QH on the now "current" frame, and you'd better do it in a manner that is coherent.

My actual schedule has a static QH list for the various interrupt intervals (128 ms at the head of the list, through 64 ms, 32 ms, etc., down to 1 ms), then a control QH, then the bulk QH (I haven't yet added ISO support):

Code:

 qh128ms -> qh64ms ... qh2ms -> qh1ms -> qhcontrol -> qhbulk
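
The post does not say how the frame list entries point into this chain. One common arrangement, assumed here, is that each frame enters at the static QH with the largest interval that divides its index, so that qh128ms is reached once every 128 frames while qh1ms is reached in every frame. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_LIST_ENTRIES 1024u
#define MAX_INTERVAL 128u /* longest interval in the static chain, in frames */

/* Returns the interval (128, 64, ..., 2, 1) of the static QH at which
 * frame list entry `index` should enter the chain. Because every QH links
 * to the next shorter interval, entering at interval N also visits all
 * shorter-interval QHs, then control and bulk. */
unsigned entry_interval(unsigned index)
{
    assert(index < FRAME_LIST_ENTRIES);
    unsigned interval = MAX_INTERVAL;
    while (index % interval != 0)
        interval >>= 1; /* always terminates: index % 1 == 0 */
    return interval;
}
```

With this mapping, frame 0 enters at qh128ms, frame 64 at qh64ms, frame 6 at qh2ms, and every odd frame at qh1ms.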

Then off each of these static QH, there is a dynamic QH list for outstanding requests for that queue type. So, for example, consider the following two 1ms interrupt transfers queued before the control descriptors:

Code:

->qh1ms ------> qhcontrol
    |            /
    \->QH1->QH2-/
        |    |
        TD   TD

Say QH1 is for the keyboard and gets a NAK, but QH2 is for the mouse and gets an interrupt. When we process QH2, it will be removed from the list, resulting in:

Code:

->qh1ms ------> qhcontrol
    |           /
    \->QH1-----/
        |
        TD

My point being, from the point of view of the controller, it sees either the first or the second scenario, but both scenarios are coherent, valid lists of QHs, and will be processed in the right order each frame. Of course, if for some reason the QH2 interrupt handler has not yet removed QH2, and the controller again visits QH1 and QH2, QH2 will be a no-op because its TDs are marked as complete. This is the case I currently don't handle correctly: I need to keep QH2 and its TDs around for another frame in case the controller re-reads QH2. But I know for sure that when I remove QH2, QH1 is still linked into the schedule and so will be visited again.
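
A minimal sketch of the atomic insert and removal discussed in this post, assuming a QH whose first two dwords are the horizontal link and the element link, with bit 0 as terminate and bit 1 as the QH flag. The struct layout, function names, and `qh_phys` parameter are illustrative, not taken from the poster's driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LINK_TERMINATE 0x1u
#define LINK_QH        0x2u

struct uhci_qh {
    volatile uint32_t head_link;    /* next QH in the chain */
    volatile uint32_t element_link; /* first TD of this queue */
};

/* Insert `qh` after `prev`. The new QH is fully set up before the single
 * 32-bit store that makes it visible to the controller. */
void qh_insert_after(struct uhci_qh *prev, struct uhci_qh *qh, uint32_t qh_phys)
{
    assert((qh_phys & 0xFu) == 0);
    qh->head_link = prev->head_link;           /* inherit the successor */
    atomic_thread_fence(memory_order_release); /* order the two stores */
    prev->head_link = qh_phys | LINK_QH;       /* publish in one write */
}

/* Unlink `qh` from after `prev` with one store. The memory of `qh` must
 * stay intact for a while afterwards, since the controller may already
 * have fetched the old link. */
void qh_remove_after(struct uhci_qh *prev, struct uhci_qh *qh, uint32_t qh_phys)
{
    assert((prev->head_link & ~0xFu) == qh_phys);
    prev->head_link = qh->head_link;
}
```

Whichever value of `prev->head_link` the controller reads, it follows a complete chain. How long a removed QH must be kept before reuse is the open case described above.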

Bonfra · Post by **Bonfra** » Fri Aug 25, 2023 1:03 am

BenLunt wrote: Mediafire says the image file doesn't meet requirements.

Sorry for the inconvenience... apparently IMG files violate Mediafire's terms of service.
I've uploaded the same image to Mega. Hope this time it doesn't get blocked.

I also tried implementing the double queue method (which should be much the same as the one suggested by Klakap, even if a bit faster) and I'm getting the same result: stalled and CRC on the SETUP packet of the first device descriptor request :(

BenLunt · Post by **BenLunt** » Fri Aug 25, 2023 9:11 am

thewrongchristian wrote:But, reading FRNUM and acting on it is not atomic. You've inherently got a race between the driver and the controller, because by the time you've determined what your "next" frame list is, an interrupt could have happened and the UHCI controller has moved onto the next frame without you realising, so you'd be putting the QH on the now "current" frame, and you'd better do it in a manner that is coherent.

Indeed, unless you let the End-of-Frame interrupt routine insert the QH for you. :-)

Bonfra wrote:Sorry for the inconvenience... apparently IMG files violate Mediafire terms of service
I've uploaded the same image to mega. Hope this time it doesn't get blocked.

I tried also implementing the double queues method (which should be quite the same as the one suggested by Klakap even if a bit faster) and I'm getting the same result: stalled and crc on the setup packet of the first device descriptor request :(

I will be back at my desk this evening and will give it another try. Then I can see if I can find something that might be troubling you.

Ben

BenLunt · Post by **BenLunt** » Fri Aug 25, 2023 6:51 pm

I'm going to have to fall back to a previous post, and still state that I think it is your reset code.

For example, at Line 158 you wait 3 milliseconds. I have found, contrary to the UHCI specs, a wait of only 300 microseconds works, where a wait of 3 ms does not. Have you been able to work with your timer code yet to get a more accurate timing system?

Also, I ran your image in Bochs, and sometimes I get the following:

Code:

00433739093i[UHCI  ] Write to port: 0 0x0291
00436239219i[UHCI  ] Write to port: 0 0x0093
00436239219i[UHCI  ] UHCI Core: Clearing the CSC while clearing the Reset may not successfully reset the port.
00436239219i[UHCI  ] UHCI Core: Clearing the CSC after the Reset has been cleared will ensure a successful reset.

Sometimes I do not.

This means that somewhere you are resetting the port and then clearing the reset with the CSC bit set in the same write. The CSC bit must be cleared only after the reset is cleared.

I can't find where you call the reset twice. I see reset function at Line 141 and then the call at Line 112, but I don't see where the other reset is called.

Please point out what file and line the first reset is made, then the file and line of the second reset.

Again, I really do think it is your timing of the reset. In my research, I found that the timing between the reset, clear reset, and enable port writes was crucial.
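
To illustrate the ordering only, here is a sketch that runs against a simulated port register rather than real hardware. It assumes the PORTSC layout in which CSC (bit 1) and PEC (bit 3) are write-1-to-clear and PE (bit 2) and reset (bit 9) are plain read/write. The `fake_port` type, the helper names, and the delay parameters are all hypothetical, and the delay values are deliberately left to the caller because the right timings are exactly what is in question here.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_CCS      (1u << 0) /* current connect status, read-only */
#define PORT_CSC      (1u << 1) /* connect status change, write 1 to clear */
#define PORT_PE       (1u << 2) /* port enable */
#define PORT_PEC      (1u << 3) /* port enable change, write 1 to clear */
#define PORT_RESET    (1u << 9)
#define PORT_RW_BITS  (PORT_PE | PORT_RESET)
#define PORT_RWC_BITS (PORT_CSC | PORT_PEC)

struct fake_port {
    uint16_t reg;
    int csc_written_while_clearing_reset; /* the pattern Bochs warns about */
};

uint16_t port_read(const struct fake_port *p) { return p->reg; }

void port_write(struct fake_port *p, uint16_t val)
{
    if ((p->reg & PORT_RESET) && !(val & PORT_RESET) && (val & PORT_CSC))
        p->csc_written_while_clearing_reset = 1;
    /* R/W bits take the written value; R/WC bits are cleared where 1 is written. */
    uint16_t keep = (uint16_t)(p->reg & ~(PORT_RW_BITS | (val & PORT_RWC_BITS)));
    p->reg = (uint16_t)(keep | (val & PORT_RW_BITS));
}

void delay_us(unsigned us) { (void)us; /* stub: real code would wait here */ }

void port_reset(struct fake_port *p, unsigned reset_us, unsigned recovery_us)
{
    /* Mask every read-modify-write so set change bits are not written back. */
    uint16_t rw = (uint16_t)(port_read(p) & PORT_RW_BITS);
    port_write(p, (uint16_t)(rw | PORT_RESET));    /* 1: assert reset */
    delay_us(reset_us);
    rw = (uint16_t)(port_read(p) & PORT_RW_BITS);
    port_write(p, (uint16_t)(rw & ~PORT_RESET));   /* 2: clear reset only */
    delay_us(recovery_us);
    rw = (uint16_t)(port_read(p) & PORT_RW_BITS);
    port_write(p, (uint16_t)(rw | PORT_RWC_BITS)); /* 3: now clear CSC/PEC */
    rw = (uint16_t)(port_read(p) & PORT_RW_BITS);
    port_write(p, (uint16_t)(rw | PORT_PE));       /* 4: enable the port */
}
```

Writing back an unmasked value read from the port in step 2 would carry a set CSC bit into the same write that clears the reset, which matches the 0x0291 then 0x0093 sequence in the log above.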

Ben

Bonfra · Post by **Bonfra** » Sat Aug 26, 2023 1:21 am

BenLunt wrote:I'm going to have to fall back to a previous post, and still state that I think it is your reset code.

For example, at Line 158 you wait 3 milliseconds. I have found, contrary to the UHCI specs, a wait of only 300 microseconds works, where a wait of 3 ms does not. Have you been able to work with your timer code yet to get a more accurate timing system?

I'm sorry I didn't update the code on GitHub. Now it's all pushed. Yes, that was wrong and I changed it some time ago. Now I've implemented basically a 1:1 adaptation of the algorithm you present in your book and that sleep is gone. (I've also used your repo, which implements a delay within the loop waiting for the port to become active.)
I only have a little doubt about the part where you check the ENABLE_CHANGE and STATUS_CHANGE bits to clear them if they are set. In the book you just clear them:

Code:

// if either enable_change or connection_change,
// clear them and continue.
if (val & (UHCI_PORT_PEC | UHCI_PORT_CSC)) {
    uhci_clr_rh_bit(base, port,
                    (UHCI_PORT_PEC | UHCI_PORT_CSC));
    continue;
}

In the repo instead, you take a more complicated mask:

Code:

#define UHCI_PORT_WRITE_MASK  0x124E    //  0001 0010 0100 1110

// if either enable_change or connection_change, clear them and continue.
if (val & ((1<<3) | (1<<1))) {
    outpw(base + port, val & UHCI_PORT_WRITE_MASK);
    continue;
}

BenLunt wrote:This means that somewhere you are resetting the port and then clearing the port with the CSC bit set. The CSC bit must be cleared only after the reset is cleared.

Maybe this is the code you are talking about? Or maybe it was just old code that didn't match what I was actually doing.

Anyway, everything starts when a PCI UHCI device is detected: everything is set up and the controller goes through the initialization phase, which internally resets the device and later hands control to the generic USB driver, which resets all the ports one by one.

thewrongchristian · Post by **thewrongchristian** » Sat Aug 26, 2023 6:59 am

BenLunt wrote:
thewrongchristian wrote:But, reading FRNUM and acting on it is not atomic. You've inherently got a race between the driver and the controller, because by the time you've determined what your "next" frame list is, an interrupt could have happened and the UHCI controller has moved onto the next frame without you realising, so you'd be putting the QH on the now "current" frame, and you'd better do it in a manner that is coherent.
Indeed, unless you let the End-of-Frame interrupt routine insert the QH for you.

I'm not sure this helps. I don't think the UHCI schedule stops while interrupts are processed. From the UHCI spec:

If an interrupt has been scheduled to be generated for the frame, the interrupt is not signaled until after the status for the last complete transaction in the frame has been written back to host memory. This may sometimes result in the interrupt not being signaled until after the Start Of Frame (SOF) for the next frame has been sent. This guarantees that the software can safely process through Frame List Current Index -1 when it is servicing an interrupt.

So, by the time interrupt processing has started in your OS, the UHCI controller may have already moved on to the next frame. Of course, at this point, the frame pointer will be pointing at the next frame, so you can insert your new QH into the frame after that, but it is still inherently racy, and necessarily adds another 1 ms to the latency before the new QH will be processed.

I will stick with my atomic insert/removal.

VSlezak · Post by **VSlezak** » Sat Aug 26, 2023 3:01 pm

Bonfra wrote:

Code:

- Find any EHCI controller on the PCI bus;
- Set its op.configFlag bit to zero so that it passes control to the companion (should be done by the BIOS since It's enabled in the settings but to be sure). Nothing else is touched in the EHCI controller (not even reset);

Are you releasing BIOS ownership of the EHCI controller? Technically it should not matter, but I think we should be sure that your BIOS is not doing anything USB related. And on hardware, just before the SETUP transfer, what is the value in the port register?

BenLunt · Post by **BenLunt** » Sat Aug 26, 2023 4:58 pm

thewrongchristian wrote:
BenLunt wrote:
thewrongchristian wrote:But, reading FRNUM and acting on it is not atomic. You've inherently got a race between the driver and the controller, because by the time you've determined what your "next" frame list is, an interrupt could have happened and the UHCI controller has moved onto the next frame without you realising, so you'd be putting the QH on the now "current" frame, and you'd better do it in a manner that is coherent.
Indeed, unless you let the End-of-Frame interrupt routine insert the QH for you. :-)
From the UHCI spec:
If an interrupt has been scheduled to be generated for the frame, the interrupt is not signaled until after the status for the last complete transaction in the frame has been written back to host memory. This may sometimes result in the interrupt not being signaled until after the Start Of Frame (SOF) for the next frame has been sent. This guarantees that the software can safely process through Frame List Current Index -1 when it is servicing an interrupt.

Exactly, I couldn't have said it better myself. This guarantees that I can safely insert one or more queues in the "other" schedule, the schedule pointed to by Frame List Current Index - 1 (or Frame List Current Index + 1), and I have approximately 1000 µs to do it in. I might even get the dishes done in that amount of time. :-)

Also, I am guaranteed that this new QH, errors aside, will be executed in as little as 1000uS from now.

thewrongchristian wrote:I will stick with my atomic insert/removal.

Please do. If we all did the same thing, where would the fun be in that? You have your way, and I respect that, and I have my way.

Ben

BenLunt · Post by **BenLunt** » Sat Aug 26, 2023 5:00 pm

Bonfra wrote:Anyway, everything starts when a PCI UHCI device is detected: everything is set up and the controller goes through the initialization phase, which internally resets the device and later hands control to the generic USB driver, which resets all the ports one by one.

Give me a little while to have a look.

Thanks,
Ben

Bonfra · Post by **Bonfra** » Sun Aug 27, 2023 4:21 am

Klakap wrote: Are you releasing BIOS ownership of EHCI controller? Technically it should not matter, but I think we should be sure that your BIOS is not doing anything USB related. And on hardware just before SETUP transfer, what is value in port register?

I eventually deleted that piece of code. I noticed that if I keep it, every single device attached via USB is handed to the UHCI; if EHCI is left alone instead, only USB 1.0 devices get handled by the UHCI controller. Since I own a USB 1.0 thumb drive, I'm using that to test, so that only one device shows up as attached and I have less data to debug.
Anyway, the behavior on the device I'm interested in is the same in every case, i.e. with or without BIOS handoff and with or without EHCI handing ports to UHCI.

VSlezak · Post by **VSlezak** » Sun Aug 27, 2023 7:08 am

If you pass all devices from EHCI to UHCI, do they all act the same way, i.e. is the first SETUP packet stalled?

Bonfra · Post by **Bonfra** » Tue Aug 29, 2023 12:08 am

Klakap wrote:If you pass from EHCI to UHCI all devices, are they all acting in same way, that first SETUP packet is stalled?

Yup, same thing.

I'm sorry Ben, I just noticed that the image I gave you doesn't use the "one queue head on all the framelist" approach we discussed before; it's just a step before this implementation (which, by the way, is the last thing I changed). Here is an updated version.
