AHCI initialization sequence for real hardware

Octocontrabass
Member
Posts: 5587
Joined: Mon Mar 25, 2013 7:01 pm

Re: AHCI initialization sequence for real hardware

Post by Octocontrabass »

Bonfra wrote:I thought that the assignment operator would copy the value, OR it with the argument, and write it back
It will, and that's the problem:
AHCI 1.3.1 section 5.5.1 wrote:the previous register content of PxCI should not be re-written
When you want to set a bit in PxCI, you should write a value with only that bit set, and write all other bits as zero regardless of their current value.
Bonfra wrote:Can you expand on this thing here? I'm not sure what you are talking about
Each PRDT entry can describe up to 0x400000 bytes, and only an even number of bytes. You're dividing the transfer into chunks of 0x3FFFFF bytes, which is one less than the correct amount and not an even number.
Bonfra
Member
Posts: 270
Joined: Wed Feb 19, 2020 1:08 pm
Libera.chat IRC: Bonfra
Location: Italy

Post by Bonfra »

xeyes wrote: Polling the bit set in CI is enough to detect cmd completion. No need to look at IS.
So in general the corresponding bit in PxCI should be cleared upon command completion, whether it succeeded or failed. This does happen in the VM, but on real hw it sits there spinning, just like it did with the other condition.
Regards, Bonfra.

Post by Bonfra »

Octocontrabass wrote:
AHCI 1.3.1 section 5.5.1 wrote:the previous register content of PxCI should not be re-written
When you want to set a bit in PxCI, you should write a value with only that bit set, and write all other bits as zero regardless of their current value.
Oh ok it has to be just

Code: Select all

// unsigned literal: shifting a signed 1 into bit 31 is undefined behavior
port->ci = (1u << slot.index);
Octocontrabass wrote: Each PRDT entry can describe up to 0x400000 bytes, and only an even number of bytes. You're dividing the transfer into chunks of 0x3FFFFF bytes, which is one less than the correct amount and not an even number.
Yes, but IIRC the chunk size stored inside the PRDT entry should be one less than the actual size: for transferring 512 bytes it should be 511.
I'm not sure about it since I never read more than 512 bytes anyway. Following your advice, I should change it both where I calculate the number of PRDT entries and where I calculate the remaining part to transfer, right? (That would be line 206 and around 241.)
Regards, Bonfra.
xeyes
Member
Posts: 212
Joined: Mon Dec 07, 2020 8:09 am

Post by xeyes »

Bonfra wrote:
xeyes wrote: Polling the bit set in CI is enough to detect cmd completion. No need to look at IS.
So in general the corresponding bit in PxCI should be cleared upon command completion, whether it succeeded or failed. This does happen in the VM, but on real hw it sits there spinning, just like it did with the other condition.
There are at least 3 possibilities:

1. Your real hw is broken
2. This code polling CI is wrong
3. Some other code has issue(s)

IMO 3. is the most likely, I'd focus on them rather than trying to change the part that is working to accommodate the broken part.

One other thing IIRC: Both the error bits and CI have special behaviors for the 1st cmd after reset. Did the 1st cmd you issue after port reset hang or was it some subsequent ones?
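
To tell a hang apart from an error while debugging, the CI poll can be bounded. A minimal sketch, assuming a stand-in `port_regs` struct (only two fields, not the real register layout) and a hypothetical `wait_slot` helper. Checking PxIS.TFES is an extra on top of the CI poll described above, and the spin limit is arbitrary:

```c
#include <stdint.h>

/* Stand-in for the port registers; only the fields used here. */
struct port_regs {
    volatile uint32_t is;   /* PxIS */
    volatile uint32_t ci;   /* PxCI */
};

#define PxIS_TFES (1u << 30)   /* task file error status */

enum cmd_result { CMD_DONE, CMD_ERROR, CMD_TIMEOUT };

/* Poll one command slot until the HBA clears its PxCI bit,
 * an error is flagged, or the spin budget runs out. */
static enum cmd_result wait_slot(struct port_regs *port, unsigned slot,
                                 uint32_t max_spins)
{
    uint32_t mask = 1u << slot;

    for (uint32_t i = 0; i < max_spins; i++) {
        if (port->is & PxIS_TFES)
            return CMD_ERROR;      /* device reported an error */
        if (!(port->ci & mask))
            return CMD_DONE;       /* slot bit cleared by the HBA */
    }
    return CMD_TIMEOUT;            /* bit never cleared */
}
```

On a timeout it is worth dumping PxIS, PxTFD and PxCMD before giving up, since those usually say more than the stuck CI bit does.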

Post by Octocontrabass »

Bonfra wrote:following your advice I should change it both where I calculate the number of prdts and where I calculate the remaining part to transfer right? (that would be line 206 and around 241)
Right.
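
A minimal sketch of that change, assuming a physically contiguous buffer. The `prdt_entry` struct and the `prdt_count`/`prdt_fill` helpers are hypothetical names, not taken from the repo. It splits on 0x400000-byte boundaries and stores each chunk's length minus one in the byte count field, so a 512-byte transfer is stored as 511 and a full entry as 0x3FFFFF:

```c
#include <stddef.h>
#include <stdint.h>

#define PRDT_MAX_BYTES 0x400000u   /* 4 MiB per entry */

/* Hypothetical PRDT entry layout (four dwords). */
struct prdt_entry {
    uint32_t dba;       /* data base address, low 32 bits */
    uint32_t dbau;      /* data base address, high 32 bits */
    uint32_t reserved;
    uint32_t dbc_i;     /* bits 21:0 = byte count - 1, bit 31 = interrupt */
};

/* Number of entries needed for `bytes` bytes, rounding up. */
static size_t prdt_count(size_t bytes)
{
    return (bytes + PRDT_MAX_BYTES - 1) / PRDT_MAX_BYTES;
}

/* Fill `prdt` for a contiguous buffer at physical address `phys`.
 * Returns the number of entries used, or 0 if the length is odd
 * or the table is too small. */
static size_t prdt_fill(struct prdt_entry *prdt, size_t max_entries,
                        uint64_t phys, size_t bytes)
{
    size_t n = prdt_count(bytes);

    if ((bytes & 1) || n > max_entries)
        return 0;

    for (size_t i = 0; i < n; i++) {
        size_t chunk = bytes < PRDT_MAX_BYTES ? bytes : PRDT_MAX_BYTES;

        prdt[i].dba = (uint32_t)phys;
        prdt[i].dbau = (uint32_t)(phys >> 32);
        prdt[i].reserved = 0;
        prdt[i].dbc_i = (uint32_t)(chunk - 1);   /* field is length - 1 */
        phys += chunk;
        bytes -= chunk;
    }
    return n;
}
```

Both numbers from the discussion show up here: 0x400000 is the chunk size used for the arithmetic, and 0x3FFFFF only ever appears as the encoded field value.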

Did you ever fix this? You need to set PxCMD.FRE and make sure PxCMD.FR is set before you can set PxCMD.ST.

Post by Bonfra »

xeyes wrote: 1. Your real hw is broken
highly doubt since the kernel is loaded from the very same disk I'm now trying to read + if instead of spinning I just sleep for 5 seconds or so to ensure command completion the thing is completely present in the buffer
xeyes wrote: IMO 3. is the most likely, I'd focus on them rather than trying to change the part that is working to accommodate the broken part.
I agree, this thing seems correct so maybe I missed some setup steps that should.. idk enable this thing or something
xeyes wrote: One other thing IIRC: Both the error bits and CI have special behaviors for the 1st cmd after reset. Did the 1st cmd you issue after port reset hang or was it some subsequent ones?
It hangs on the very first command I send which is an IDENTIFY, I should note that I don't perform a reset of the device, where exactly in the process should this be done? surely it has to be after I check PxSSTS for device availability since that thing is only populated after some command is executed (probably by the bios since I never touched this device before), probably before idling the port?
Regards, Bonfra.

Post by Bonfra »

Octocontrabass wrote: Did you ever fix this? You need to set PxCMD.FRE and make sure PxCMD.FR is set before you can set PxCMD.ST.
so instead of that, I need this?

Code: Select all

    port->cmd |= HBA_PxCMD_FRE;
    while(!(port->cmd & HBA_PxCMD_FR))
        asm("pause");
    port->cmd |= HBA_PxCMD_ST;
Regards, Bonfra.

Post by xeyes »

Bonfra wrote:
xeyes wrote: 1. Your real hw is broken
highly doubt since the kernel is loaded from the very same disk I'm now trying to read + if instead of spinning I just sleep for 5 seconds or so to ensure command completion the thing is completely present in the buffer
xeyes wrote: IMO 3. is the most likely, I'd focus on them rather than trying to change the part that is working to accommodate the broken part.
I agree, this thing seems correct so maybe I missed some setup steps that should.. idk enable this thing or something
xeyes wrote: One other thing IIRC: Both the error bits and CI have special behaviors for the 1st cmd after reset. Did the 1st cmd you issue after port reset hang or was it some subsequent ones?
It hangs on the very first command I send which is an IDENTIFY, I should note that I don't perform a reset of the device, where exactly in the process should this be done? surely it has to be after I check PxSSTS for device availability since that thing is only populated after some command is executed (probably by the bios since I never touched this device before), probably before idling the port?
If you did controller reset that will effectively reset all ports as well.

It seems safer to me so I always do controller reset before touching anything else, but after bios hand off so I don't race with bios. There might be other conditions listed in spec for a reset, but IMO a reset should work at any time and should get the controller into a usable state.

The special thing IIRC is a diag cmd (0x90) issued automatically as part of the reset protocol. It will leave some unusual status bits in the task file which won't go away until you issue your 1st cmd. Depending on what your polling code checks for, these bits may confuse the polling code so you might need a special condition for the 1st cmd after reset. This process is not very well documented in either the ATA or the controller spec AFAIK, so the above is my own observation rather than how it should work. You might want to double check how your hw actually behaves and adjust accordingly.
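
A rough sketch of the controller reset step itself, assuming the first generic host control registers (CAP, GHC, IS, PI) sit at the start of the HBA memory block. `hba_regs`, `hba_reset` and the spin limit are hypothetical. The AHCI spec describes GHC.HR as a bit that software sets and the HBA clears once the reset has finished, and a real driver would use a timer (the spec treats one second without completion as a hung HBA) rather than a spin count:

```c
#include <stdbool.h>
#include <stdint.h>

/* First generic host control registers, in offset order. */
struct hba_regs {
    volatile uint32_t cap;   /* 0x00 */
    volatile uint32_t ghc;   /* 0x04 */
    volatile uint32_t is;    /* 0x08 */
    volatile uint32_t pi;    /* 0x0C */
};

#define GHC_HR (1u << 0)     /* HBA reset */
#define GHC_AE (1u << 31)    /* AHCI enable */

/* Reset the whole HBA (and with it every port).
 * Returns false if GHC.HR is still set after max_spins polls. */
static bool hba_reset(struct hba_regs *hba, uint32_t max_spins)
{
    hba->ghc |= GHC_AE;      /* make sure AHCI mode is on first */
    hba->ghc |= GHC_HR;      /* request the reset */

    for (uint32_t i = 0; i < max_spins; i++) {
        if (!(hba->ghc & GHC_HR)) {
            hba->ghc |= GHC_AE;   /* the reset may have cleared AE */
            return true;
        }
    }
    return false;            /* HBA looks hung */
}
```

This only covers the HBA side; each implemented port still needs its own setup afterwards.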

Post by Bonfra »

xeyes wrote: If you did controller reset that will effectively reset all ports as well.

It seems safer to me so I always do controller reset before touching anything else, but after bios hand off so I don't race with bios. There might be other conditions listed in spec for a reset, but IMO a reset should work at any time and should get the controller into a usable state.
so you are suggesting something like

Code: Select all

    bios_handoff(hba);
    ahci_reset(hba);
    hba->ghc |= HBA_GHC_AE;
    hba->is = ~0u;

    for(uint8_t bit = 0; bit < 32; bit++)
        if(hba->pi & (1u << bit)) {
            // per port things
        }
But this goes in conflict with the per port init code that checks for

Code: Select all

    uint32_t sig = port->sig;
    if(sig != SATA_SIG_ATA && sig != SATA_SIG_ATAPI)
        kernel_panic("ahci: Unknown PxSIG (%#X)", sig);
after a reset, the sig register is set to -1 and it is automatically repopulated only after some command is sent, this means I can't access some device information until I send a command to it, which I can't send until I have some information about it

idk, it's all really confusing: it all seems to be working except for this final minor detail that's killing the whole thing
Regards, Bonfra.

Post by xeyes »

Bonfra wrote: so you are suggesting something like

idk, it's all really confusing: it all seems to be working except for this final minor detail that's killing the whole thing
I'm just saying that's how I do it. I'm sure it's okay to do it in other ways as long as you don't deviate too much from the specs.

More to the point, it is probably better to follow your own general design and flow rather than "killing the whole thing" just because you heard that someone (me or someone else) did things differently. That person might have done other parts differently as well, which makes it risky for you to change only one part.
Bonfra wrote: it is automatically repopulated only after some command is sent
It is very likely that your hw also sends a 0x90 cmd automatically after port reset, if there is a drive connected. The drive will then send a response FIS for the 0x90, which sets the port signature.

Post by Bonfra »

xeyes wrote: It is very likely that your hw also sends the 0x90 cmd automatically after port reset, if there is a drive connected, it will send a response FIS for the 0x90. Which will set the port signature.
I didn't test this on physical hardware but the VM leaves the sig as -1, I'm aiming to support both hardware and VMs so it's not a feasible solution for the moment, maybe it lacks something else along the way...
do you mind sharing your code so I can take some "inspiration" from it :)
Regards, Bonfra.

Post by xeyes »

Bonfra wrote:
xeyes wrote: It is very likely that your hw also sends the 0x90 cmd automatically after port reset, if there is a drive connected, it will send a response FIS for the 0x90. Which will set the port signature.
I didn't test this on physical hardware but the VM leaves the sig as -1, I'm aiming to support both hardware and VMs so it's not a feasible solution for the moment, maybe it lacks something else along the way...
do you mind sharing your code so I can take some "inspiration" from it :)
VM should follow a similar flow, otherwise who would send the 1st cmd? I doubt that BIOS or a kernel is supposed to issue a throwaway cmd just to get the signature set.

Maybe take a peek at the *BSDs if you want to look at some good working code?

I've not looked myself, but it is said that OpenBSD is known for rigorous (correct) code even at the cost of performance, while NetBSD is known for trying very hard to support any hw, even the half-broken ones.

:lol: I don't want to inspire you the wrong way with what I have which I'm 100% sure isn't as correct as the above ones. I haven't found the motivation to look seriously at licensing and all the associated mess with it either.

Post by Octocontrabass »

Bonfra wrote:so instead of that, I need this?

Code: Select all

    port->cmd |= HBA_PxCMD_FRE;
    while(!(port->cmd & HBA_PxCMD_FR))
        asm("pause");
    port->cmd |= HBA_PxCMD_ST;
Sorry, typo: you need to make sure PxCMD.FRE is set, not PxCMD.FR. You also need to ensure PxCMD.CR is clear, for some reason. Section 10.3.1 lists everything else you need to do first, but I think you've covered all of those.
xeyes wrote:The special thing IIRC is a diag cmd (0x90) issued automatically as part of the reset protocol.
No commands are issued; ATA requires the drive to perform diagnostics as part of the reset procedure. This has been part of ATA since before ATA existed, so recent versions of the spec may not describe it very well, but it's clearly spelled out in ATA-1.
Bonfra wrote:I didn't test this on physical hardware but the VM leaves the sig as -1
Which VM is behaving this way? In QEMU, anything that resets the port (including resetting the entire HBA) immediately sets PxSIG according to the attached drive.

On real hardware, you have to wait for COMRESET to complete, and you may need to initiate COMRESET yourself depending on the HBA's capabilities.
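
For ports where the HBA does not start it for you, a COMRESET can be requested through PxSCTL.DET. A minimal sketch, assuming a stand-in `link_regs` struct with only the fields used here; `comreset_begin` and `comreset_end` are hypothetical helpers, and the caller is expected to wait at least 1 ms between the two calls, as the port reset description in the AHCI spec asks:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the port registers; only the fields used here. */
struct link_regs {
    volatile uint32_t cmd;    /* PxCMD */
    volatile uint32_t ssts;   /* PxSSTS */
    volatile uint32_t sctl;   /* PxSCTL */
    volatile uint32_t serr;   /* PxSERR */
};

#define PxCMD_ST        (1u << 0)
#define PxSCTL_DET_MASK 0xFu
#define PxSSTS_DET_MASK 0xFu
#define DET_PRESENT_PHY 0x3u   /* device present, Phy communication up */

/* Step 1: request COMRESET. The port must be stopped (PxCMD.ST clear). */
static bool comreset_begin(struct link_regs *port)
{
    if (port->cmd & PxCMD_ST)
        return false;
    port->sctl = (port->sctl & ~PxSCTL_DET_MASK) | 1u;   /* DET = 1h */
    return true;
}

/* Step 2, at least 1 ms later: release DET and wait for the link. */
static bool comreset_end(struct link_regs *port, uint32_t max_spins)
{
    port->sctl &= ~PxSCTL_DET_MASK;                      /* DET = 0h */

    for (uint32_t i = 0; i < max_spins; i++) {
        if ((port->ssts & PxSSTS_DET_MASK) == DET_PRESENT_PHY) {
            port->serr = 0xFFFFFFFFu;   /* write 1s to clear reset errors */
            return true;
        }
    }
    return false;                       /* no device, or link never came up */
}
```

A `false` from `comreset_end` is not necessarily a failure: an empty port never reaches DET = 3h, so the caller can simply skip it.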

Post by Bonfra »

Ahci specs 10.3.1 wrote: Software shall not set PxCMD.ST to ‘1’ until it verifies that PxCMD.CR is ‘0’ and has set PxCMD.FRE to
‘1’. Additionally, software shall not set PxCMD.ST to ‘1’ until a functional device is present on the port (as
determined by PxTFD.STS.BSY = ‘0’, PxTFD.STS.DRQ = ‘0’, and (PxSSTS.DET = 3h, or PxSSTS.IPM =
2h or 6h or 8h)) and PxCLB/PxCLBU are programmed to valid values
So not only do I need to set PxCMD.FRE and spin on PxCMD.CR, but I also need to spin until PxTFD.STS.BSY and PxTFD.STS.DRQ are cleared and PxSSTS.DET or PxSSTS.IPM hold meaningful values.
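
Those conditions can be folded into one bounded wait. A minimal sketch, assuming a stand-in `port_regs` struct (only three fields, not the real layout) and that PxCLB/PxCLBU and PxFB/PxFBU are already programmed; `device_ready`, `port_start` and the spin limit are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the port registers; only the fields used here. */
struct port_regs {
    volatile uint32_t cmd;    /* PxCMD */
    volatile uint32_t tfd;    /* PxTFD */
    volatile uint32_t ssts;   /* PxSSTS */
};

#define PxCMD_ST  (1u << 0)
#define PxCMD_FRE (1u << 4)
#define PxCMD_CR  (1u << 15)
#define TFD_DRQ   (1u << 3)   /* PxTFD.STS.DRQ */
#define TFD_BSY   (1u << 7)   /* PxTFD.STS.BSY */

/* "Functional device present" test from the quoted section 10.3.1. */
static bool device_ready(const struct port_regs *port)
{
    uint32_t ssts = port->ssts;
    uint32_t det = ssts & 0xFu;          /* PxSSTS.DET, bits 3:0 */
    uint32_t ipm = (ssts >> 8) & 0xFu;   /* PxSSTS.IPM, bits 11:8 */

    if (port->tfd & (TFD_BSY | TFD_DRQ))
        return false;
    return det == 0x3 || ipm == 0x2 || ipm == 0x6 || ipm == 0x8;
}

/* Set FRE, then set ST only once CR is clear and a device is ready. */
static bool port_start(struct port_regs *port, uint32_t max_spins)
{
    port->cmd |= PxCMD_FRE;

    for (uint32_t i = 0; i < max_spins; i++) {
        if (!(port->cmd & PxCMD_CR) && device_ready(port)) {
            port->cmd |= PxCMD_ST;
            return true;
        }
    }
    return false;   /* preconditions never met; do not set ST */
}
```

If the wait times out, leaving ST clear keeps the port in a state where a port reset can still be tried.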
Octocontrabass wrote: Which VM is behaving this way? In QEMU, anything that resets the port (including resetting the entire HBA) immediately sets PxSIG according to the attached drive.
I'm testing this using qemu


EDIT
I somehow noticed that the value in sig is updated after I set the PxCMD.FRE bit, so maybe I can work a bit with that. This only applies to QEMU; spinning on PxSIG after setting PxCMD.FRE on real hw hangs forever.
Regards, Bonfra.

Post by Bonfra »

Well, I did some more research and figured that it would be a good idea to not only reset the device but also reconfigure all ports. This solves the problem of getting the signature from an uninitialized port that I mentioned in a previous post. It slows down the boot process quite a bit, but that's something I can deal with later.
I updated the repo once again; it's a mess, but at least it partially works.
Now the big news: It works! Kinda. Well if after the device is initialized I sleep for like 2 seconds I can use it fine but if I don't I have two behaviors:
1) the bit in PxCI is cleared and the command is not accepted, no error is signaled in PxIS and the command is not executed but marked as if it was; later commands executed after some time will run fine. (this occurs in VMs)
2) the bit in PxCI is never cleared as if the command has been rejected, it is never executed and the bit stays there; it's as if the device hasn't yet received the HBA_PxCMD_ST bit. (this occurs in real hw)
As I said it's easily fixable by adding that delay before running any command but it isn't ideal. I think we are getting closer to making this thing work. Maybe I'm missing the final thing to spin on.
this is the delay thing I mentioned above.
Regards, Bonfra.