AHCI controller reports success but reads no data

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
mariuszp
Member
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: AHCI controller reports success but reads no data

Post by mariuszp »

Looks like my command line matches yours, and I enabled the extra tracing, thanks for the tip.

I've just tried it in QEMU with a slight change (setting SUD and POD after the COMRESET instead of before, as that's what you did). Same result but the trace shows something very strange:

Code: Select all

ahci_reset ahci(0x55b23a7ab300): HBA reset
ahci_reset_port ahci(0x55b23a7ab300)[0]: reset port
ahci_set_signature ahci(0x55b23a7ab300)[0]: set signature sector:0x01 nsector:0x01 lcyl:0x00 hcyl:0x00 (cumulatively: 0x00000101)
ahci_reset_port ahci(0x55b23a7ab300)[1]: reset port
ahci_reset_port ahci(0x55b23a7ab300)[2]: reset port
ahci_reset_port ahci(0x55b23a7ab300)[3]: reset port
ahci_reset_port ahci(0x55b23a7ab300)[4]: reset port
ahci_reset_port ahci(0x55b23a7ab300)[5]: reset port
ahci_mem_write_host ahci(0x55b23a7ab300) write4 [reg:GHC] @ 0x4: 0x0000000080000001
ahci_mem_read_32_host ahci(0x55b23a7ab300): mem read [reg:GHC] @ 0x4: 0x80000000
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x4: 0x80000000
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x4: 0x0000000080000000
ahci_mem_read_32_host ahci(0x55b23a7ab300): mem read [reg:PI] @ 0xc: 0x0000003f
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0xc: 0x0000003f
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0xc: 0x000000000000003f
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_mem_write ahci(0x55b23a7ab300): write4 @ 0x118: 0x0000000000000006
ahci_port_write ahci(0x55b23a7ab300)[0]: port write [reg:PxCMD] @ 0x18: 0x00000006
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_mem_write ahci(0x55b23a7ab300): write4 @ 0x118: 0x0000000000000006
ahci_port_write ahci(0x55b23a7ab300)[0]: port write [reg:PxCMD] @ 0x18: 0x00000006
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_mem_write ahci(0x55b23a7ab300): write4 @ 0x118: 0x0000000010000006
ahci_port_write ahci(0x55b23a7ab300)[0]: port write [reg:PxCMD] @ 0x18: 0x10000006
ahci_mem_read_32_host ahci(0x55b23a7ab300): mem read [reg:CAP] @ 0x0: 0xc0141f05
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x0: 0xc0141f05
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x0: 0x00000000c0141f05
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x118: 0x00000006
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x118: 0x0000000000000006
ahci_mem_write ahci(0x55b23a7ab300): write4 @ 0x118: 0x0000000000000006
ahci_port_write ahci(0x55b23a7ab300)[0]: port write [reg:PxCMD] @ 0x18: 0x00000006
ahci_mem_write ahci(0x55b23a7ab300): write4 @ 0x130: 0x00000000ffffffff
ahci_port_write ahci(0x55b23a7ab300)[0]: port write [reg:PxSERR] @ 0x30: 0xffffffff
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxSSTS] @ 0x28: 0x00000113
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x128: 0x00000113
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x128: 0x0000000000000113
ahci_port_read ahci(0x55b23a7ab300)[0]: port read [reg:PxSIG] @ 0x24: 0xffffffff
ahci_mem_read_32 ahci(0x55b23a7ab300): mem read @ 0x124: 0xffffffff
ahci_mem_read ahci(0x55b23a7ab300): read4 @ 0x124: 0x00000000ffffffff
ahci_mem_read_32_host ahci(0x55b23a7ab300): mem read [reg:PI] @ 0xc: 0x0000003f
We see near the top that it resets the port and puts in the signature 0x101 which is what we expect.

Next we see it reading PI thanks to which it discovers that port 0 is implemented.

We then see it progress through ahciStopCmd() with a bunch of reads from PxCMD,

Then it sets PxCMD.ICC = ACTIVE, reads PxCAP to find that staggered spin-up is not suported, and then sets PxCMD.SUD and PxCMD.POD (with no effect, they're already set).

It then writes all 1's to PxSERR, reads PxSSTS to find that DET=OK and IPM=ACTIVE, so proceeds to read the signature.

Now it reads the signature and the log indicates it returned 0xFFFFFFFF, even though it showed previously that the signature was set correctly. It shows that we have read the PxSIG port as we should.

I can see from the logs that none of the memory accesses were optimized out. Not really sure why it reports this... Maybe it's related to the problem on physical hardware.

EDIT: If I dump the TFD register at this point, its value is 0x7F, which is also the reset value. As if the reset is somehow not fully complete...

Looking at QEMU's code (hw/ide/ahci.c), pr->sig gets set in ahci_reset_port() to 0xFFFFFFFF, then later calls ahci_init_d2h() which sets pr->sig to the correct signature. The signature logged by QEMU is the one set in the IDE state, not in the PxSIG register. It only sets pr->sig if ahci_write_fis_d2h() succeeds; but it can fail if FRE is not set (which it isn't), implying I need to be able to receive FIS'es before I can read the signature.
mariuszp
Member
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: AHCI controller reports success but reads no data

Post by mariuszp »

I have refactored my code so that it sets up the memory structures and sets PxCMD.FRE before attempting to reset the port and read the signature, just line linuxyne's code does. Now it works perfectly in both QEMU and VirtualBox, but still fails in exactly the same way on physical hardware (PRDBC=0 etc). linuxyne's code works though, so there must be more difference, so I'll keep looking.
mariuszp
Member
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: AHCI controller reports success but reads no data

Post by mariuszp »

I found the difference between my code and linuxyne's after some refactoring, and I found what was causing the issue. I was setting the C bit in the command heaeder, which is supposed to "Clear the BSY bit when the command completes". When this bit is 0, the code works in all virtual machines and also real hardware. Setting this bit causes no issues on virtual machines, but stops my physical hardware from working. I'm not yet sure why that is, but hope this helps anyone who is stuck with a problem like this.
Octocontrabass
Member
Member
Posts: 5560
Joined: Mon Mar 25, 2013 7:01 pm

Re: AHCI controller reports success but reads no data

Post by Octocontrabass »

mariuszp wrote:which is supposed to "Clear the BSY bit when the command completes"
No, it clears the BSY bit when the device acknowledges reception of the FIS. You shouldn't set that bit when sending a command in a Register H2D FIS because the drive will reply with a Register D2H FIS to clear the BSY bit when the command is complete.

I'm not sure exactly why it only works in virtual machines, but I suspect it's related to power management. When the BSY bit is clear, the HBA thinks the device is idle, and it might be switching to a low-power mode before the drive can reply. Virtual machines don't emulate that.
linuxyne
Member
Member
Posts: 211
Joined: Sat Jul 02, 2016 7:02 am

Re: AHCI controller reports success but reads no data

Post by linuxyne »

The bit seems to be used only during the legacy software-reset-sequence.

Some recent details about its handling in qemu
mariuszp
Member
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: AHCI controller reports success but reads no data

Post by mariuszp »

Thank you for your help, I understand what happened now then. As I was fixing other bugs in the AHCI driver (before I created this thread), I found at one point that the BSY bit was stuck on, and setting this "C" bit was one of the things I tried to resolve the problem. I guess it didn't solve it so I tried something else, and just forgot to clear the C bit again.

In my original code which I posted, you can see that the code for setting up the FIS and command header is repeated once for the IDENTIFY command, and once for the READ/WRITE (in ataTransferBlocks). The IDENTIFY code does the correct thing (does not set C), but it's being set in ataTransferBlocks. It was once I refactored this code to remove this repetition, by turning this into an ataIssueCmd() function, based on ataTransferBlocks(), that I noticed that now IDENTIFY is failing too. After trying every difference I finally found this C bit being the only one that has an effect (I tried everything in isolation), and that linuxnye didn't do it.

I think I was a little too early in my programming experience when I began this project, and now, having worked on some performance-critical, but simpler projects such as a game, and also working a professional programming job, I now have way more understanding for what's wrong with my OS code, and I have a better understanding on how to fix everything. I am really excited to get back into OSDev. Fixing this driver only gave me more motivation.
Post Reply