AHCI initialization sequence for real hardware

Posted: Thu Sep 01, 2022 3:37 pm
by Bonfra
I have code that works perfectly in VMs (QEMU, VMware, VirtualBox) but behaves horribly on real hardware. Following the list in the wiki, I came up with this function to initialize the controller (the PCI device is given DMA and MMIO privileges in an earlier code segment):

Code:

static void init_device(volatile hba_mem_t* hba)
{
    enable_ahci_mode(hba);
    bios_handoff(hba);
    idle_ports(hba);
    ahci_reset(hba);
    enable_ahci_mode(hba);

    for(uint8_t bit = 0; bit < 32; bit++)
        if(hba->pi & (1u << bit)) // bit set: this port is implemented
        {
            volatile hba_port_t* port = &hba->ports[bit];

            if(!init_port(port))
                continue;

            // some more stuff
        }
}
where init_port is

Code:

static bool init_port(volatile hba_port_t* port)
{
    uint64_t page0 = (uint64_t)pfa_calloc(1);

    uint64_t cmd_address = page0;
    uint32_t cmd_low = cmd_address & 0xFFFFFFFFLL;
    uint32_t cmd_high = (cmd_address >> 32) & 0xFFFFFFFFLL;

    uint64_t fis_address = page0 + 1024;
    uint32_t fis_low = fis_address & 0xFFFFFFFFLL;
    uint32_t fis_high = (fis_address >> 32) & 0xFFFFFFFFLL;

    port->clb = cmd_low;
    port->clbu = cmd_high;

    port->fb = fis_low;
    port->fbu = fis_high;

    port->cmd |= HBA_PxCMD_FRE;
    port->cmd |= HBA_PxCMD_SUD;

    pit_prepare_one_shot(1);
    pit_perform_one_shot();

    uint64_t spin = 0;
    while((port->ssts & HBA_PxSSTS_DET) != 3 && spin < 1000000)
        spin++;

    if(spin >= 1000000)
        return false;

    port->serr = 0xFFFFFFFF;

    spin = 0;
    while(port->tfd & HBA_PxTFD_STS_DRQ && port->tfd & HBA_PxTFD_STS_BSY && spin < 1000000)
        spin++;

    if(spin >= 1000000)
        return false;

    port->is = 0;

    volatile hba_cmd_header_t* cmd_header = ptr(cmd_address);
    uint64_t page1 = (uint64_t)pfa_calloc(1);
    uint64_t page2 = (uint64_t)pfa_calloc(1);

    for (uint8_t i = 0; i < 32; i++)
    {
        cmd_header[i].prdtl = 1;
        cmd_header[i].pmp = 0;
        cmd_header[i].c = true;
        cmd_header[i].b = false;
        cmd_header[i].r = false;
        cmd_header[i].p = false;
        cmd_header[i].a = false;

        uint64_t cmd_addr = i < 16 ? page1 : page2;
        cmd_addr += 256 * (i % 16);

        uint32_t cmd_addr_low = cmd_addr & 0xFFFFFFFFull;
        uint32_t cmd_addr_high = (cmd_addr >> 32) & 0xFFFFFFFFull;

        cmd_header[i].ctba = cmd_addr_low;
        cmd_header[i].ctbau = cmd_addr_high;
    }

    return true;
}
Every other function I'm not posting can be found here.

This last init_port function is the one that fails inside the second spin loop and stops the initialization of the device. I tried following other lists of steps too, but they all produced the same behavior, i.e. the device is not initialized.

I'm completely clueless as to why this is not working. I'm in your hands.

Re: AHCI initialization sequence for real hardware

Posted: Thu Sep 01, 2022 4:42 pm
by Octocontrabass
Your code to place the ports in the idle state won't work on hardware. The correct procedure is described in section 10.1.2 of the AHCI specification. (You reset the entire HBA afterwards, so this probably won't make a difference, but you might as well fix it anyway.)
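
For reference, a minimal sketch of that idle procedure as I read it: clear PxCMD.ST, wait for PxCMD.CR to clear, then clear PxCMD.FRE and wait for PxCMD.FR to clear, with a timeout of about 500 ms on each wait. The bit positions come from the spec; `clock_ms`, `wait_clear` and `port_idle` are hypothetical names, and the clock here is only a stand-in so the sketch runs outside a kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* PxCMD bit positions from the AHCI spec. */
#define PXCMD_ST  (1u << 0)   /* Start */
#define PXCMD_FRE (1u << 4)   /* FIS Receive Enable */
#define PXCMD_FR  (1u << 14)  /* FIS Receive Running */
#define PXCMD_CR  (1u << 15)  /* Command List Running */

/* Hypothetical millisecond clock. A real kernel would read a timer here;
 * this stand-in simply advances by one on every call. */
static uint64_t fake_ms;
static uint64_t clock_ms(void) { return fake_ms++; }

/* Wait until every bit in mask reads as 0, or give up after timeout_ms. */
static bool wait_clear(volatile uint32_t *reg, uint32_t mask, uint64_t timeout_ms)
{
    uint64_t start = clock_ms();
    while (*reg & mask) {
        if (clock_ms() - start > timeout_ms)
            return false;
    }
    return true;
}

/* Put one port in the idle state; cmd points at that port's PxCMD. */
static bool port_idle(volatile uint32_t *cmd)
{
    *cmd &= ~PXCMD_ST;                     /* stop command processing */
    if (!wait_clear(cmd, PXCMD_CR, 500))   /* command list must stop running */
        return false;
    *cmd &= ~PXCMD_FRE;                    /* then stop FIS reception */
    return wait_clear(cmd, PXCMD_FR, 500);
}
```

If either wait times out, the port is not idle and needs a port or HBA reset before you continue.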

The spin loop giving you trouble spins 1 million times. Is that long enough for the attached device to finish resetting? I haven't yet located the appropriate timeout for SATA, but IDE requires you to spin for at least 30 seconds!

You have many spin loops that either loop forever or break after a specific number of iterations instead of using timeouts.

Re: AHCI initialization sequence for real hardware

Posted: Thu Sep 01, 2022 5:25 pm
by Bonfra
Octocontrabass wrote: The spin loop giving you trouble spins 1 million times. Is that long enough for the attached device to finish resetting? I haven't yet located the appropriate timeout for SATA, but IDE requires you to spin for at least 30 seconds!
OK, I just tried with a fixed 30-second delay, but in QEMU for some reason it detects one more (nonexistent) device, and in real hw the spin loop passes successfully but then every interaction with the disk just stays there forever, so I think the initialization process didn't go as planned.
Octocontrabass wrote:Your code to place the ports in the idle state won't work on hardware. The correct procedure is described in section 10.1.2 of the AHCI specification. (You reset the entire HBA afterwards, so this probably won't make a difference, but you might as well fix it anyway.)
I'll post again tomorrow as soon as I implement this procedure here. I hope for the best.
Octocontrabass wrote: You have many spin loops that either loop forever or break after a specific number of iterations instead of using timeouts.
If you are talking about the ones in the init process, I fixed them as

Code:

    t0 = time_sinceepoch();
    while(condition)
        if(time_sinceepoch() - t0 > TIMEOUT)
            return false;
The one in the function to send a command never exits because it should always be possible (given how I designed the code) to send a command

Re: AHCI initialization sequence for real hardware

Posted: Thu Sep 01, 2022 6:39 pm
by Octocontrabass
Bonfra wrote:OK, I just tried with a fixed 30-second delay, but in QEMU for some reason it detects one more (nonexistent) device,
What's the value of PxSIG for this nonexistent device?
Bonfra wrote:in real hw the spin loop passes successfully but then every interaction with the disk just stays there forever so I think the initialization process didn't go as planned
Stays where forever?
Bonfra wrote:The one in the function to send a command never exits because it should always be possible (given how I designed the code) to send a command
Unless the HBA hangs for some reason. And your code to send a command is broken - you can't set PxCMD.ST at the same time you set PxCMD.FRE (section 10.3.2 of the AHCI spec) and you shouldn't need to set those bits to send a command anyway because both bits should remain set between commands.
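
A rough sketch of the ordering I mean, assuming my reading of the spec is right: set PxCMD.FRE first (after PxFB/PxFBU are valid), and set PxCMD.ST in a separate write only once the device is present (PxSSTS.DET = 3) and PxTFD shows neither BSY nor DRQ. The `port_regs_t` struct is a hypothetical three-register stand-in, not the real port layout, and `port_start` is a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

#define PXCMD_ST           (1u << 0)  /* Start */
#define PXCMD_FRE          (1u << 4)  /* FIS Receive Enable */
#define PXTFD_STS_DRQ      (1u << 3)
#define PXTFD_STS_BSY      (1u << 7)
#define PXSSTS_DET_MASK    0xFu
#define PXSSTS_DET_PRESENT 0x3u       /* device present, Phy communication up */

/* Hypothetical subset of the port registers, for illustration only. */
typedef struct {
    uint32_t cmd;   /* PxCMD  */
    uint32_t tfd;   /* PxTFD  */
    uint32_t ssts;  /* PxSSTS */
} port_regs_t;

/* Start a port once; afterwards FRE and ST stay set between commands. */
static bool port_start(volatile port_regs_t *p)
{
    p->cmd |= PXCMD_FRE;  /* first write: enable FIS reception */

    if ((p->ssts & PXSSTS_DET_MASK) != PXSSTS_DET_PRESENT)
        return false;     /* no functional device on this port */
    if (p->tfd & (PXTFD_STS_BSY | PXTFD_STS_DRQ))
        return false;     /* device not ready yet; caller retries with a timeout */

    p->cmd |= PXCMD_ST;   /* second, separate write: start command processing */
    return true;
}
```

The function that sends a command then never touches ST or FRE; it only fills in the command slot and writes PxCI.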

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 6:46 am
by Bonfra
Octocontrabass wrote: What's the value of PxSIG for this nonexistent device?
One is 0x101 and the other is 0xEB140101, so ATA and ATAPI. I should point out that the two devices seem to be the same, i.e. reading the first sector on both returns the same values; but maybe I'm just going crazy, because even after reverting the code to how it was before any modification it still detects two identical devices...
Octocontrabass wrote: Stays where forever?
It hangs inside the function to read sectors so, since that is the only place in this call tree where infinite loops exist, I guess it's somewhere in those loops
Octocontrabass wrote: And your code to send a command is broken - you can't set PxCMD.ST at the same time you set PxCMD.FRE (section 10.3.2 of the AHCI spec) and you shouldn't need to set those bits to send a command anyway because both bits should remain set between commands.
In general, we can assert that my code is a complete mess and should be rewritten from scratch...

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 7:11 am
by devc1
It is normal, I get a second ATAPI Device on QEMU too.

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 10:59 am
by Bonfra
So I completely trashed my code and wrote a new version following both the spec and some repos on GitHub. Now it works on both emulators and real hw; I just need one last thing to call it 100% working.
Whenever an ATA command is issued by setting the corresponding bit in the HBA_PxCI register, the HBA starts processing the command. The implementations I used as reference take an asynchronous approach here and do something else until an interrupt is raised signaling command completion. I'd like to do this synchronously for the moment, so I need something to spin on that signals that the command has finished. I tried HBA_PxTFD_STS_BSY, but it was cleared before the transfer buffer was filled; I also tried spinning on the very same bit I set in HBA_PxCI to issue the command, but while that works as I need on VMs, it hangs forever on real hw.
What bit should I spin on?

P.S.
This is one of the masterpieces that helped me with the rewrite

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 11:36 am
by Octocontrabass
Bonfra wrote:What bit should I spin on?
Which bits of PxIE do you want to set once you're ready to make your driver asynchronous?

Spin on the corresponding bits of PxIS.

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 11:47 am
by Bonfra
Octocontrabass wrote:
Bonfra wrote:What bit should I spin on?
Which bits of PxIE do you want to set once you're ready to make your driver asynchronous?
Spin on the corresponding bits of PxIS
I'm not going to make it asynchronous any time soon, so I didn't even look at interrupts... is PxIS updated even if interrupts are disabled? I'm not even sure which bit is the correct one for completion, maybe DPS?
Anyway, this is just a single bit, how do I know which command has been completed when this bit is flipped?

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 1:03 pm
by Octocontrabass
Bonfra wrote:is PxIS updated even if interrupts are disabled?
Yes. (Section 3.3.6 of the AHCI spec.)
Bonfra wrote:I'm not even sure which bit is the correct one for completion, maybe DPS?
That depends on which events you care about. If you only care about command completion, you probably want PxIS.DHRS, but I think some errors don't set that bit.
Bonfra wrote:Anyway, this is just a single bit, how do I know which command has been completed when this bit is flipped?
If you're not using queued commands, check PxCI. If you're using queued commands, check PxSACT.

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 1:28 pm
by Bonfra
Octocontrabass wrote: If you're not using queued commands, check PxCI. If you're using queued commands, check PxSACT.
Considering idk what queued commands are I think I need PxCI, so something like this?

Code:

spin:
while(!(port->is & HBA_PxSI_DHRS))
    asm("pause");
if(port->ci & (1 << cmd_slot))
    goto spin;
Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.

I just updated the repo so you can check the exact loop that sits there forever. (the href points to the correct line)

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 2:48 pm
by Octocontrabass
Bonfra wrote:Considering idk what queued commands are I think I need PxCI, so something like this?
I'm pretty sure there has been an error if you receive an interrupt notification for command completion and the command you submitted did not complete. Get rid of the goto.

I can't say I'm especially familiar with AHCI interrupts, so one of the others might be a better choice. Does your VM set any other bits in PxIS?
Bonfra wrote:Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.
This means there's still something wrong with your setup. I'm not sure exactly what, but I spotted two problems: you shouldn't use a read-modify-write operation to set bits in PxCI (section 5.5.1 of the AHCI spec), and you have several off-by-one errors in your PRDT length calculation.
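
A minimal sketch of both points, with hypothetical helper names. For PxCI, write only the bit for the slot being issued: writing 0 to the other bits has no effect, while a read-modify-write can write back a 1 for a slot that completed in the meantime. For the PRDT, the data byte count field (DBC) is zero-based, so it holds the length in bytes minus one; I'm assuming that is at least one of the off-by-one sources.

```c
#include <stdint.h>

/* Issue the command in one slot (slot must be 0..31). pxci points at the
 * port's PxCI register. A plain store is used on purpose, not |=. */
static void issue_slot(volatile uint32_t *pxci, unsigned slot)
{
    *pxci = 1u << slot;
}

/* DBC value for one PRDT entry. byte_count is assumed to be even and at
 * least 2, since the field stores "length - 1" and its bit 0 must be 1. */
static uint32_t prdt_dbc(uint32_t byte_count)
{
    return byte_count - 1;
}
```

For a single 512-byte sector in one PRDT entry, that gives a DBC of 511.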

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 3:07 pm
by Bonfra
Octocontrabass wrote: I'm pretty sure there has been an error if you receive an interrupt notification for command completion and the command you submitted did not complete. Get rid of the goto.
That goto was meant as: "If a command has completed, was it the one I requested?", because multiple commands could be issued at once by different threads (yes, my implementation still lacks thread safety). Anyway, it was just a thought for the future; as you have noticed from the repo, it didn't make it into the current implementation :P
Bonfra wrote:Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.
Here I forgot to specify that while on real hw it spins forever, in the VM it never spins, so the command is considered complete (the function returns) even if the transfer buffer is not completely filled, so I can't be 100% sure about the value in PxIS. I printed it before and after the spin on PxSI_DHRS: it always has bit 0 set and sometimes (mainly on the first commands) bit 30, so a task file error.
Octocontrabass wrote: you shouldn't use a read-modify-write operation to set bits in PxCI (section 5.5.1 of the AHCI spec)
I thought that the compound assignment operator would copy the value, OR it with the argument, and write it back; anyway, even with that thing expanded it still behaves the same.
Can you expand on this thing here? I'm not sure what you are talking about

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 3:13 pm
by xeyes
Bonfra wrote:
Octocontrabass wrote: If you're not using queued commands, check PxCI. If you're using queued commands, check PxSACT.
Considering idk what queued commands are I think I need PxCI, so something like this?

Code:

spin:
while(!(port->is & HBA_PxSI_DHRS))
    asm("pause");
if(port->ci & (1 << cmd_slot))
    goto spin;
Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.

I just updated the repo so you can check the exact loop that sits there forever. (the href points to the correct line)
Polling the bit set in CI is enough to detect cmd completion. No need to look at IS.

You can also check for errors in the polling loop by looking at the bits associated with task file errors. But AHCI in real HW seems to be very reliable, so this is probably not strictly needed for now.
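
Roughly what I mean, as a sketch rather than a drop-in: poll the slot's bit in PxCI until the HBA clears it, bail out if PxIS.TFES (bit 30) shows a task file error, and give up after a timeout instead of spinning forever. `clock_ms` is a hypothetical stand-in for a real kernel timer, and `wait_slot` and the result enum are made-up names.

```c
#include <stdint.h>

#define PXIS_TFES (1u << 30)  /* Task File Error Status */

typedef enum { CMD_DONE, CMD_ERROR, CMD_TIMEOUT } cmd_result_t;

/* Hypothetical millisecond clock; advances by one on every call so the
 * sketch runs without a real timer. */
static uint64_t fake_ms;
static uint64_t clock_ms(void) { return fake_ms++; }

/* Wait synchronously for the command in `slot` to finish. */
static cmd_result_t wait_slot(volatile uint32_t *pxci, volatile uint32_t *pxis,
                              unsigned slot, uint64_t timeout_ms)
{
    uint64_t start = clock_ms();
    for (;;) {
        if (*pxis & PXIS_TFES)
            return CMD_ERROR;          /* device reported an error */
        if ((*pxci & (1u << slot)) == 0)
            return CMD_DONE;           /* HBA cleared the bit: command completed */
        if (clock_ms() - start > timeout_ms)
            return CMD_TIMEOUT;        /* HBA or device is not responding */
    }
}
```

On real hardware a CMD_TIMEOUT here usually points at a setup problem earlier on, not at the polling itself.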

Re: AHCI initialization sequence for real hardware

Posted: Fri Sep 02, 2022 3:21 pm
by Bonfra
Something I could have known if I had read the spec about interrupts is that you need to clear the IS register yourself. Now it works 100% in the VM but still doesn't on real hw.
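
For anyone else who trips on this, a small sketch of what "clear it yourself" means, as far as I understand it: most PxIS bits are write-1-to-clear, so writing back the value you read clears exactly the bits that were set, and writing 0 clears nothing. `w1c_after_write` is only a software model of that behavior for illustration; `port_clear_is` is a hypothetical helper for the real register.

```c
#include <stdint.h>

/* Software model of a write-1-to-clear register: every bit written as 1
 * is cleared, every bit written as 0 keeps its current value. */
static uint32_t w1c_after_write(uint32_t current, uint32_t written)
{
    return current & ~written;
}

/* On hardware: acknowledge all pending status bits of one port by writing
 * back what was read. pxis points at the port's PxIS register. */
static void port_clear_is(volatile uint32_t *pxis)
{
    uint32_t pending = *pxis;
    *pxis = pending;
}
```

So `port->is = 0;` as in my first version does not clear anything; it has to be a write of ones.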