EHCI malarchy
Posted: Thu Jul 11, 2013 2:00 pm
by Corry
I'm a bit stumped here. The controller is 8086:3b3c (one of the Intel ICHs...). The gist of the problem: if I take ownership and then reboot without handing ownership back to the BIOS, then on the next boot every read from the memory the BAR points to returns 0xFFFFFFFF.
Here's what I have tried so far:
Initially I thought I could just turn it back over to the BIOS, so I re-enabled all SMIs, but nothing. I still get -1s.
So perhaps it needs an MTRR set for it. When the controller works, it doesn't have an MTRR entry, but maybe something is sticky....so I set it anyhow. Nope, nothing.
I checked power management. Maybe it's in D3? Nope, it's set to D0. I then tried cycling to D3 and back to D0. Nothing.
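For anyone repeating the D-state check, here is a minimal sketch of how the power state is encoded. It assumes the standard PCI power-management capability (capability ID 0x01) with the PMCSR at capability base + 4. The helper names are made up for illustration, and the config-space reads and writes are left to whatever accessor you already have.

```c
#include <stdint.h>

#define PCI_CAP_ID_PM       0x01u   /* power-management capability ID */
#define PCI_PM_CTRL_OFFSET  4u      /* PMCSR sits at capability base + 4 */
#define PCI_PM_CTRL_STATE   0x0003u /* PowerState field, bits 1:0 */

/* Returns 0..3 for D0..D3hot from a PMCSR value read from config space. */
static unsigned pci_pm_state(uint16_t pmcsr)
{
    return pmcsr & PCI_PM_CTRL_STATE;
}

/*
 * Returns the PMCSR value to write back in order to request state d (0..3).
 * All other bits are preserved. Note that PME_Status (bit 15) is
 * write-1-to-clear, so writing back a value that has it set clears it.
 */
static uint16_t pci_pm_with_state(uint16_t pmcsr, unsigned d)
{
    return (uint16_t)((pmcsr & ~PCI_PM_CTRL_STATE) | (d & PCI_PM_CTRL_STATE));
}
```

When cycling D3hot back to D0, the PCI power-management spec asks for a delay (10 ms) before touching the device again, so a loop that writes and immediately reads may not be testing what it appears to test.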
I copied down all of the PCI config space (256 bytes) for when things work and when they don't, and other than the ownership, it's all the same.
Now here's the kicker: Windows is installed on the machine's local hard drive. After messing things up (taking and not relinquishing ownership), I boot Windows, and it has no problem accessing the controllers. Worse, if I power off the machine while Windows is running and then boot into my OS, the owner is still set to OS, but I can read from the BAR pointer! Clearly there is *something* I can do, but I'm running out of stupid ideas to try. Anyone have any more (preferably smart) ideas on what else to try? I'm about at my wits' end on this thing!
Sure, I can work around it, but the stupid BIOS already doesn't wait long enough for the external hard drive to become ready, so I have to sit and hover, pause the machine, wait for the drive to spin up, then continue, then boot, which half the time will fail anyhow....not my idea of a good time!
Here's to hoping there's something really simple I'm missing!
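For reference, a minimal sketch of how the ownership state described above can be decoded. The bit positions come from the EHCI 1.0 spec (EECP in HCCPARAMS bits 15:8, semaphores in USBLEGSUP bits 16 and 24); the function and enum names are hypothetical. One caveat: HCCPARAMS lives behind the BAR, so in the failing state the EECP would have to come from somewhere else, such as a value noted while the controller was still responding.

```c
#include <stdbool.h>
#include <stdint.h>

#define EHCI_LEGSUP_CAP_ID      0x01u        /* USBLEGSUP capability ID */
#define EHCI_LEGSUP_BIOS_OWNED  (1u << 16)   /* HC BIOS Owned Semaphore */
#define EHCI_LEGSUP_OS_OWNED    (1u << 24)   /* HC OS Owned Semaphore */

enum ehci_owner_state {
    EHCI_OWNER_NONE,
    EHCI_OWNER_BIOS,
    EHCI_OWNER_OS,
    EHCI_OWNER_BOTH   /* handoff in progress, or stuck */
};

/* EECP: config-space offset of the first extended capability.
 * Values below 0x40 mean there is none. */
static uint8_t ehci_eecp(uint32_t hccparams)
{
    return (uint8_t)((hccparams >> 8) & 0xFFu);
}

/* True if the dword read at EECP is the legacy-support capability. */
static bool ehci_is_legsup(uint32_t cap)
{
    return (cap & 0xFFu) == EHCI_LEGSUP_CAP_ID;
}

/* Decodes who currently claims the controller from a USBLEGSUP value. */
static enum ehci_owner_state ehci_decode_owner(uint32_t legsup)
{
    bool bios = (legsup & EHCI_LEGSUP_BIOS_OWNED) != 0;
    bool os   = (legsup & EHCI_LEGSUP_OS_OWNED) != 0;

    if (bios && os) return EHCI_OWNER_BOTH;
    if (bios)       return EHCI_OWNER_BIOS;
    if (os)         return EHCI_OWNER_OS;
    return EHCI_OWNER_NONE;
}
```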
Re: EHCI malarchy
Posted: Thu Jul 11, 2013 7:26 pm
by thepowersgang
Aaah... "reboot"? I assume you mean a controller reset, as a machine reset will just take you back to where you started. Once you've read and mapped the BARs, you shouldn't need to refer to them again (until the machine is rebooted)... but they should still stay correct, and should never read 0xFFFFFFFF (the bottom bits of a memory BAR indicate its type and range, and are read-only).
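As a rough illustration of those bottom bits, here is a sketch that decodes a raw 32-bit BAR value according to the standard PCI layout (bit 0: I/O vs. memory; bits 2:1: memory type; bit 3: prefetchable). The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct pci_bar {
    bool     is_io;         /* bit 0 set: I/O space BAR */
    bool     is_64bit;      /* memory BAR, type field == 10b */
    bool     prefetchable;  /* memory BAR, bit 3 */
    uint32_t base;          /* low 32 bits of the base address */
};

/* Splits a raw BAR value into its flag bits and base address. */
static struct pci_bar pci_decode_bar(uint32_t raw)
{
    struct pci_bar bar;

    bar.is_io = (raw & 0x1u) != 0;
    bar.is_64bit = false;
    bar.prefetchable = false;

    if (bar.is_io) {
        bar.base = raw & ~0x3u;            /* I/O BARs use bits 31:2 */
        return bar;
    }

    bar.is_64bit = ((raw >> 1) & 0x3u) == 0x2u;
    bar.prefetchable = (raw & 0x8u) != 0;
    bar.base = raw & ~0xFu;                /* memory BARs use bits 31:4 */
    return bar;
}
```

A BAR register that itself read back as 0xFFFFFFFF would decode as an I/O BAR here, which is one quick way to tell a bad config read apart from a sane memory BAR.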
Re: EHCI malarchy
Posted: Thu Jul 11, 2013 8:24 pm
by Mikemk
Disassemble Windows?
Re: EHCI malarchy
Posted: Thu Jul 11, 2013 8:58 pm
by Corry
I do actually mean reboot! Crazy I know! Thus the title with the word malarchy!
My guess was that the BIOS is supposed to do.....something.....if it thought it was supposed to. Since the controller comes up as OS owned, I figured it decided to go hands off. That value, in this laptop, survives a power off. I didn't try pulling the battery and waiting a couple of minutes, but I assume that would in fact force it back to defaults, and everything would work as expected. I was happy to just call it a BIOS bug and move on until Windows was able to use it without a hitch. Clearly, there is that aforementioned something I could do to force the thing into an operational state, if I knew what that something was!
Let me clarify though. No matter what, when I read the BAR, it's pointing to 0xFEDA1C00 (IIRC; it's at work and I'm at home now, but it's somewhere near the top of the 4GB space). Upon reboot, with it working or not, it comes up as the same value. When I access memory at said address and the controller is initially marked as BIOS owned, I get the registers of the EHCI controller. When it reboots as OS owned and I access that address, I get -1. That's where I started trying all of my shenanigans to get it back to the EHCI regs like Windows does. I assume Linux would work fine too, though I haven't tried. I've been going through the pci-quirks.c file for Linux hoping to find an answer. I suppose worst case is I compile a custom Linux with tons of debug output to see WTF it does, but I had hoped I wasn't the only one to run into this issue, and someone else could quickly point me in the right direction!
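A minimal sketch of how the "-1" state can be detected before touching anything else. It assumes `regs` points at the memory the BAR describes, mapped uncached, and relies on the EHCI capability layout where the first dword holds CAPLENGTH in byte 0 and HCIVERSION in bytes 2-3. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* False when the first capability dword reads as all ones, i.e. nothing
 * is decoding the address and the read is going unclaimed. */
static bool ehci_mmio_responds(const volatile uint32_t *regs)
{
    return regs[0] != 0xFFFFFFFFu;
}

/* CAPLENGTH: offset from the BAR base to the operational registers. */
static uint8_t ehci_caplength(uint32_t cap0)
{
    return (uint8_t)(cap0 & 0xFFu);
}

/* HCIVERSION: BCD interface version, e.g. 0x0100 for EHCI 1.0. */
static uint16_t ehci_hciversion(uint32_t cap0)
{
    return (uint16_t)(cap0 >> 16);
}
```

Checking this once at startup at least turns the failure into an explicit "controller not decoding" message rather than a driver that walks garbage registers.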
Re: EHCI malarchy
Posted: Thu Jul 11, 2013 9:13 pm
by Corry
I suppose I should also mention a few details about the OS.
First, calling it an OS is a joke. It's not intended to be that; it's more of an embedded single-task program...
Second, because of the first, it's ring 0 only.
To simplify things, it is in protected mode, but memory is mapped flat with no paging (yet), again to keep things simple. (Eventually I'll hit up long mode, so eventually I'll have to do paging.)
It is supposed to boot from USB. UHCI and OHCI are implemented for what I need (bulk/control to boot from USB), and I'm looking to do the same for EHCI, and if I feel particularly bored, stupid, zealous, or some combination thereof, I'll go for XHCI too (at least Bochs emulates it and I can use that to tell me what stupidity I'm committing)
I think that should be all the relevant info....
Like I said, something beyond the OS controls is surviving reboots, and is therefore likely getting power from standby. Either it triggers the BIOS not to do something that would actually let me see the values in the HC regs, or perhaps the BIOS is FUBARing the HC. Either way, Windows for sure deals with it without issue, and I assume Linux would as well. Any guesses as to what that something might be?
Re: EHCI malarchy
Posted: Mon Jul 15, 2013 1:37 am
by Combuster
"Like I said, something beyond the OS controls is surviving reboots"
What kind of machine is it, how do you "reboot" it?
Re: EHCI malarchy
Posted: Mon Jul 15, 2013 7:38 am
by Corry
It's a modern Dell laptop, Core i7 (Nehalem), 16 GB RAM.
To "reboot" it, I have to physically power off the machine with the power button (4 second hold). I guess that leads to more eventual questions though, so let me add a bit more about what I'm doing.
I'm not really using interrupts. I have the timer interrupt and processor faults go to a function which tells me what fault occurred, and a bit of information about where it occurred. That's it though. The rest get mapped to a "null" handler which just acknowledges the interrupt and returns. Again, the whole point of this was to be simple, so I didn't want to parse out ACPI. I figured for one device (the USB controller) I could just poll it.
There is no keyboard input. I understand it's a pretty simple thing to add, but this is supposed to be totally autonomous, so why poll the keyboard?
There is text video output. An unfortunate necessity for debugging
(Well, I suppose I could have used the serial port, but that would still involve more code)
I think that should answer most other questions....
Anyhow, further thinking on the problem makes me think it's not that the HC isn't responding; my guess is it's an error condition. The machine has 16 GB of RAM in it, so if the BIOS was ignoring the HC, it should be mapping physical RAM into that space, which shouldn't result in -1 being read no matter what is written.
Looking at pci-quirks.c in Linux, they have cases where they never "take control" of the chip. I could do the same and just disable all SMIs like they do, but they only seem to do that on some tablet computers. As a quick test for a workaround, I tried taking ownership, disabling all SMIs, then "relinquishing" control. Theoretically, I should still have full control of the hardware, but when the machine reboots, I don't get the stupid error condition. I really don't like this hack, for the simple reason that the standard MS and Linux drivers seem to bring the controller up fine no matter what state it is in. The only thing I can think of is that perhaps something in the AML script resets the controller somehow? As I said before, I'm not doing anything with ACPI, so I suppose it's a possibility. Still, I should be able to replicate that one piece of functionality, right?
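The take-ownership, disable-SMIs, relinquish sequence can be sketched as the register values involved. This assumes the EHCI 1.0 legacy-support layout (USBLEGSUP at EECP, USBLEGCTLSTS at EECP + 4, with the SMI enable bits in the low half of USBLEGCTLSTS). The helper names are hypothetical, and the actual config-space writes, polling loop and timeout are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets relative to EECP in PCI config space. */
#define EHCI_USBLEGSUP_OFFSET     0x00u
#define EHCI_USBLEGCTLSTS_OFFSET  0x04u

#define EHCI_LEGSUP_BIOS_OWNED  (1u << 16)   /* HC BIOS Owned Semaphore */
#define EHCI_LEGSUP_OS_OWNED    (1u << 24)   /* HC OS Owned Semaphore */

/* Step 1: value to write to USBLEGSUP to request ownership. */
static uint32_t ehci_legsup_claim(uint32_t legsup)
{
    return legsup | EHCI_LEGSUP_OS_OWNED;
}

/* Step 2: re-read USBLEGSUP until this is true (give up after a timeout). */
static bool ehci_bios_released(uint32_t legsup)
{
    return (legsup & EHCI_LEGSUP_BIOS_OWNED) == 0;
}

/* Step 3: value to write to USBLEGCTLSTS. Zero clears every SMI enable
 * bit; the write-1-to-clear status bits are left untouched by zeros. */
static uint32_t ehci_legctlsts_smis_off(void)
{
    return 0u;
}

/* Step 4 (the hack): value to write to USBLEGSUP to drop the OS-owned
 * flag again, so the next boot does not see "OS owned". */
static uint32_t ehci_legsup_release(uint32_t legsup)
{
    return legsup & ~EHCI_LEGSUP_OS_OWNED;
}
```

Each helper takes the value just read from the register, so the caller re-reads USBLEGSUP between steps instead of reusing a stale copy.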
Re: EHCI malarchy
Posted: Mon Jul 15, 2013 11:40 am
by Corry
Hmmmm.....after a lot more reading on the topic, and applying various fixes for various operating systems, I'm going to guess it's something "chipset related", which to me means device specific, which further translates to not worth it (unless someone out there knows for a fact that the same technique would work on all Intel ICHs, in which case it might be worth the investigation). So unless someone has some really bright idea, I think I'm shelving this issue and will just go with the ownership hack + reboot that seems to work. It's a massive pain, but not a show stopper.
Re: EHCI malarchy
Posted: Fri Sep 13, 2013 3:33 pm
by ehenkes
"I'll go for XHCI too (at least Bochs emulates it and I can use that to tell me what stupidity I'm committing)"
Bochs is not the best way to develop for xHCI, because it reports errors where everything is OK. For example, Bochs does not know the No Op command and reports "unknown TRB".
VMware, for its part, lets you read back CRCR, which is supposed to return zero when read.
Therefore, I test everything on real hardware.
Re: EHCI malarchy
Posted: Fri Sep 13, 2013 9:07 pm
by Corry
I only wish I had gotten to XHCI... That project is on hold/canceled, unfortunately. I never could get the controller to send anything after the initial request. I'd like to go back to it, but only after I have some idea of WTF was wrong with it.
Re: EHCI malarchy
Posted: Sat Sep 14, 2013 2:56 am
by rdos
I don't think you need to take ownership of EHCI. I would leave that part out and assume the BIOS won't do anything with EHCI anyway after the OS has taken control. OTOH, this might not work without enabling ACPI.
Re: EHCI malarchy
Posted: Sat Sep 14, 2013 10:32 am
by Corry
Oh, it's been a while since I posted this, and I didn't remember which issue I had posted about. The ownership wasn't the only problem, but it could be worked around... I stopped working on it because of another issue altogether. I could send out an initial device config request and get the max packet size, but all future requests would fail. The controller would copy the next request and report success, but return an empty buffer, and it wouldn't pick up another request. With it reporting success, I had no idea where to look to figure the issue out, and because of the time it was taking, it was shelved. I could probably still go back to it, but without some direction it's pointless.
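For context, the initial request that did work is presumably the standard GET_DESCRIPTOR(DEVICE) control transfer. A minimal sketch of that setup packet and of pulling the max packet size out of the reply, following USB 2.0 chapter 9, is below. The struct and function names are hypothetical, and the code assumes a little-endian host so the 16-bit fields match wire order.

```c
#include <stdint.h>

/* 8-byte SETUP packet; the natural layout has no padding. */
struct usb_setup_packet {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
};

#define USB_REQ_GET_DESCRIPTOR  6u
#define USB_DESC_TYPE_DEVICE    1u

/* Builds GET_DESCRIPTOR(DEVICE) asking for `length` bytes
 * (8 is enough to reach bMaxPacketSize0). */
static struct usb_setup_packet usb_get_device_descriptor_setup(uint16_t length)
{
    struct usb_setup_packet pkt;

    pkt.bmRequestType = 0x80u;   /* device-to-host, standard, device */
    pkt.bRequest      = USB_REQ_GET_DESCRIPTOR;
    pkt.wValue        = (uint16_t)(USB_DESC_TYPE_DEVICE << 8); /* index 0 */
    pkt.wIndex        = 0;
    pkt.wLength       = length;
    return pkt;
}

/* bMaxPacketSize0 is byte 7 of the device descriptor. */
static uint8_t usb_max_packet_size0(const uint8_t *device_descriptor)
{
    return device_descriptor[7];
}
```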
Re: EHCI malarchy
Posted: Sat Sep 14, 2013 10:36 am
by Kortath
Corry, thanks for posting this thread. It's been one I have been watching for a while now. This is not an easy topic either. Good questions.