I have code that works perfectly in VMs (QEMU, VMware, VirtualBox) but behaves horribly on real hardware. Following the list in the wiki, I came up with this init function (the PCI device is given DMA and MMIO privileges in a prior code segment):
Every other function I'm not posting can be found here.
This last init_port function is the one that fails inside the second spin loop and stops the initialization of the device. I tried following other lists of steps too, but they all produced the same behavior, i.e. the device is not initialized.
I'm completely clueless as to why this is not working. I'm in your hands.
The spin loop giving you trouble spins 1 million times. Is that long enough for the attached device to finish resetting? I haven't yet located the appropriate timeout for SATA, but IDE requires you to spin for at least 30 seconds!
You have many spin loops that either loop forever or break after a specific number of iterations instead of using timeouts.
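A minimal sketch of what a time-bounded spin loop could look like, as opposed to one bounded by an iteration count. The names `timer_now_ms` and `wait_bits_clear` are hypothetical and not taken from the original code; the clock here is a stub that advances 1 ms per call so the sketch runs on its own, and a real kernel would back it with the PIT, HPET or a calibrated TSC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical monotonic millisecond clock. This stub advances by 1 ms on
 * every call so the sketch is self-contained; replace it with a real timer. */
static uint64_t fake_ms;
uint64_t timer_now_ms(void) { return fake_ms++; }

/* Spin until every bit in `mask` reads as 0 in `*reg`, or until `timeout_ms`
 * has elapsed. Returns true if the bits cleared in time, false on timeout. */
bool wait_bits_clear(volatile const uint32_t *reg, uint32_t mask,
                     uint64_t timeout_ms)
{
    uint64_t deadline = timer_now_ms() + timeout_ms;

    for (;;) {
        if ((*reg & mask) == 0)
            return true;
        if (timer_now_ms() >= deadline)
            return false;
    }
}
```

Because the limit is expressed in time rather than iterations, the loop should behave the same on a fast real machine and on a slow emulator.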
Octocontrabass wrote:
The spin loop giving you trouble spins 1 million times. Is that long enough for the attached device to finish resetting? I haven't yet located the appropriate timeout for SATA, but IDE requires you to spin for at least 30 seconds!
OK, I just tried with a fixed 30-second delay, but in QEMU it detects one more (nonexistent) device for some reason. On real hardware the spin loop passes successfully, but then every interaction with the disk hangs forever, so I think the initialization process didn't go as planned.
Octocontrabass wrote:Your code to place the ports in the idle state won't work on hardware. The correct procedure is described in section 10.1.2 of the AHCI specification. (You reset the entire HBA afterwards, so this probably won't make a difference, but you might as well fix it anyway.)
I'll post again tomorrow as soon as I implement this procedure. I hope for the best.
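For reference, a rough sketch of the idle procedure from section 10.1.2 as I read it: clear PxCMD.ST, wait for PxCMD.CR to read 0, then clear PxCMD.FRE if it is set and wait for PxCMD.FR to read 0, allowing at least 500 ms for each wait. The function names and the stub clock are assumptions made so the example is self-contained; only the bit positions come from the AHCI specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* PxCMD bit positions from the AHCI specification. */
#define PXCMD_ST  (1u << 0)   /* Start */
#define PXCMD_FRE (1u << 4)   /* FIS Receive Enable */
#define PXCMD_FR  (1u << 14)  /* FIS Receive Running */
#define PXCMD_CR  (1u << 15)  /* Command List Running */

/* Hypothetical millisecond clock; stubbed to advance 1 ms per call. */
static uint64_t fake_ms;
uint64_t timer_now_ms(void) { return fake_ms++; }

/* Wait for `mask` to clear in `*reg`, giving up after `ms` milliseconds. */
static bool wait_clear(volatile uint32_t *reg, uint32_t mask, uint64_t ms)
{
    uint64_t deadline = timer_now_ms() + ms;

    while (*reg & mask) {
        if (timer_now_ms() >= deadline)
            return false;
    }
    return true;
}

/* Put a port into the idle state. Returns false if the HBA did not stop
 * within the timeout; the caller would then fall back to a port/HBA reset. */
bool ahci_port_make_idle(volatile uint32_t *pxcmd)
{
    *pxcmd &= ~PXCMD_ST;                      /* stop the command list */
    if (!wait_clear(pxcmd, PXCMD_CR, 500))
        return false;

    if (*pxcmd & PXCMD_FRE) {
        *pxcmd &= ~PXCMD_FRE;                 /* stop FIS reception */
        if (!wait_clear(pxcmd, PXCMD_FR, 500))
            return false;
    }
    return true;
}
```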
Octocontrabass wrote:
You have many spin loops that either loop forever or break after a specific number of iterations instead of using timeouts.
If you are talking about the ones in the init process, I fixed them as
Bonfra wrote:OK, I just tried with a fixed 30-second delay, but in QEMU it detects one more (nonexistent) device for some reason.
What's the value of PxSIG for this nonexistent device?
Bonfra wrote:On real hardware the spin loop passes successfully, but then every interaction with the disk hangs forever, so I think the initialization process didn't go as planned.
Stays where forever?
Bonfra wrote:The one in the function that sends a command never exits, because (the way I designed the code) it should always be possible to send a command.
Unless the HBA hangs for some reason. And your code to send a command is broken - you can't set PxCMD.ST at the same time you set PxCMD.FRE (section 10.3.2 of the AHCI spec) and you shouldn't need to set those bits to send a command anyway because both bits should remain set between commands.
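A small sketch of the ordering described above, assuming the port is already idle and PxCLB/PxFB point at valid memory: FRE is set in one write and ST in a second, separate write, and this is done once at port start rather than on every command. `ahci_port_start` is a hypothetical name, not a function from the original code.

```c
#include <stdbool.h>
#include <stdint.h>

/* PxCMD bit positions from the AHCI specification. */
#define PXCMD_ST  (1u << 0)   /* Start */
#define PXCMD_FRE (1u << 4)   /* FIS Receive Enable */
#define PXCMD_CR  (1u << 15)  /* Command List Running */

/* Start command processing on a port. Meant to be called once after the
 * port has been set up; both bits then stay set between commands. */
bool ahci_port_start(volatile uint32_t *pxcmd)
{
    if (*pxcmd & PXCMD_CR)
        return false;        /* engine still running: idle the port first */

    *pxcmd |= PXCMD_FRE;     /* first write: enable FIS reception */
    *pxcmd |= PXCMD_ST;      /* second write: start the command list */
    return true;
}
```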
Octocontrabass wrote:
What's the value of PxSIG for this nonexistent device?
One is 0x00000101 and the other 0xEB140101, so ATA and ATAPI. I should point out that the two devices seem to be the same, i.e. reading the first sector on both returns the same values; but maybe I'm just going crazy, 'cause even after reverting the code to how it was before any modification it still detects two identical devices...
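For readers following along, the two PxSIG values above map to device types roughly like this. The enum and function names are made up for illustration; only the two signature values mentioned in this thread are handled, and anything else is reported as unknown.

```c
#include <stdint.h>

/* Hypothetical device-type enum, for illustration only. */
enum ahci_dev_type {
    AHCI_DEV_UNKNOWN,
    AHCI_DEV_ATA,
    AHCI_DEV_ATAPI
};

/* Classify a port by the signature read from PxSIG. */
enum ahci_dev_type ahci_classify_sig(uint32_t pxsig)
{
    switch (pxsig) {
    case 0x00000101u: return AHCI_DEV_ATA;    /* SATA disk */
    case 0xEB140101u: return AHCI_DEV_ATAPI;  /* SATAPI device, e.g. optical drive */
    default:          return AHCI_DEV_UNKNOWN;
    }
}
```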
Octocontrabass wrote:
Stays where forever?
It hangs inside the function to read sectors so, this being the only place in the call tree where infinite loops exist, I guess it's somewhere in these loops
Octocontrabass wrote:
And your code to send a command is broken - you can't set PxCMD.ST at the same time you set PxCMD.FRE (section 10.3.2 of the AHCI spec) and you shouldn't need to set those bits to send a command anyway because both bits should remain set between commands.
In general, we can assert that my code is a complete mess and should be rewritten from scratch...
So I completely trashed my code and wrote a new version following both the specs and some repos on GitHub, now it works on both emulators and real hw, I just need one last thing to call this thing 100% working.
Whenever an ATA command is issued by setting the corresponding bit in the HBA_PxCI register, the HBA starts processing the command. The implementations I used as reference take an asynchronous approach here and do something else until an interrupt is raised signaling the command completion. I'd like to do this synchronously for the moment, so I need something to spin on that would signal that the command has ended. I tried with HBA_PxTFD_STS_BSY, but it was cleared before the transfer buffer was filled. I also tried to spin on the very same bit I set in HBA_PxCI to issue the command, but while that works as I need on VMs, it hangs forever on real hw.
What bit should I spin on?
P.S. This is one of the masterpieces that helped me with the rewrite
Last edited by Bonfra on Fri Sep 02, 2022 11:41 am, edited 1 time in total.
Which bits of PxIE do you want to set once you're ready to make your driver asynchronous?
Spin on the corresponding bits of PxIS
I'm not going to make it asynchronous any time soon so I didn't even look through interrupts... is PxIS updated even if interrupts are disabled? I'm not even sure what bit is the correct one for completion, maybe DPS?
Anyway, this is just a single bit, how do I know which command has been completed when this bit is flipped?
Bonfra wrote:is PxIS updated even if interrupts are disabled?
Yes. (Section 3.3.6 of the AHCI spec.)
Bonfra wrote:I'm not even sure what bit is the correct one for completion, maybe DPS?
That depends on which events you care about. If you only care about command completion, you probably want PxIS.DHRS, but I think some errors don't set that bit.
Bonfra wrote:Anyway, this is just a single bit, how do I know which command has been completed when this bit is flipped?
If you're not using queued commands, check PxCI. If you're using queued commands, check PxSACT.
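One way to picture this: the driver remembers which slots it has issued and compares that against the register the HBA clears on completion (PxCI for non-queued commands, PxSACT for queued ones). A tiny sketch, with a hypothetical helper name:

```c
#include <stdint.h>

/* Given the bitmask of slots the driver has issued and the current value of
 * PxCI (or PxSACT for queued commands), return the slots that have finished:
 * issued by us, but no longer marked outstanding by the HBA. */
uint32_t ahci_completed_slots(uint32_t issued, uint32_t outstanding)
{
    return issued & ~outstanding;
}
```

With slots 0 and 2 issued and only slot 2 still set in PxCI, the result has just bit 0 set, so the command in slot 0 is the one that completed.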
Bonfra wrote:Considering idk what queued commands are I think I need PxCI, so something like this?
I'm pretty sure there has been an error if you receive an interrupt notification for command completion and the command you submitted did not complete. Get rid of the goto.
I can't say I'm especially familiar with AHCI interrupts, so one of the others might be a better choice. Does your VM set any other bits in PxIS?
Bonfra wrote:Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.
Octocontrabass wrote:
I'm pretty sure there has been an error if you receive an interrupt notification for command completion and the command you submitted did not complete. Get rid of the goto.
That goto thing served as: "If a command has completed, was it the one I requested?" 'cause multiple commands could be issued at once by different threads (yes, my implementation still lacks thread safety). Anyway, it was just a thought for the future; as you have noticed from the repo, it didn't make it into the current implementation
Bonfra wrote:Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.
Here I forgot to specify that while on real hw it spins forever, in the VM it never spins, so the command is considered complete (the function returns) even if the transfer buffer is not completely filled, so I can't be 100% sure about the value inside PxIS. I printed it before and after the spin on PxIS_DHRS: it always has bit 0 set and sometimes (mainly on the first commands) bit 30, so task file error
I thought that the assignment operator would copy the value, OR it with the argument and write it back; anyway, even with that thing expanded it still behaves the same
Anyway even without the check for the command so just keeping the spin on the interrupt status it hangs in real hw.
I just updated the repo so you can check the exact loop that sits there forever. (the href points to the correct line)
Polling the bit set in CI is enough to detect cmd completion. No need to look at IS.
You can also check for errors in the polling loop by looking at the bits associated with task file errors. But AHCI in real HW seems to be very reliable, so this is probably not strictly needed for now.
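A sketch of such a polling loop, under a few assumptions: the slot's bit in PxCI is polled for completion, PxIS.TFES (bit 30) is checked for task file errors, and the whole thing is bounded by a timeout. The function and enum names are hypothetical, and the clock is a stub that advances 1 ms per call so the example is self-contained.

```c
#include <stdint.h>

#define PXIS_TFES (1u << 30)  /* Task File Error Status */

/* Hypothetical result codes for the wait. */
enum ahci_wait_result {
    AHCI_WAIT_DONE,
    AHCI_WAIT_TASK_FILE_ERROR,
    AHCI_WAIT_TIMEOUT
};

/* Hypothetical millisecond clock; stubbed to advance 1 ms per call. */
static uint64_t fake_ms;
uint64_t timer_now_ms(void) { return fake_ms++; }

/* Wait for command slot `slot` (0..31) to complete by polling PxCI. */
enum ahci_wait_result ahci_wait_slot(volatile const uint32_t *pxci,
                                     volatile const uint32_t *pxis,
                                     unsigned slot, uint64_t timeout_ms)
{
    uint32_t bit = 1u << slot;
    uint64_t deadline = timer_now_ms() + timeout_ms;

    for (;;) {
        if (*pxis & PXIS_TFES)
            return AHCI_WAIT_TASK_FILE_ERROR;  /* device reported an error */
        if ((*pxci & bit) == 0)
            return AHCI_WAIT_DONE;             /* HBA cleared the slot bit */
        if (timer_now_ms() >= deadline)
            return AHCI_WAIT_TIMEOUT;
    }
}
```

Returning a timeout instead of spinning forever at least turns a hang on real hardware into something that can be logged and debugged.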
Something I could have known if I had read the spec about interrupts is that you need to clear the IS register yourself. Now it 100% works in the VM, but it still doesn't on real hw
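To illustrate why the way PxIS is cleared matters: most of its bits are write-1-to-clear, so writing a 1 acknowledges that bit and writing a 0 leaves it alone. The sketch below models that behavior in plain software (it does not touch hardware) and shows that a read-OR-write such as `PxIS |= bit` ends up acknowledging every pending bit, not only the one named. All function names are hypothetical.

```c
#include <stdint.h>

#define PXIS_DHRS (1u << 0)   /* Device to Host Register FIS interrupt */
#define PXIS_TFES (1u << 30)  /* Task File Error Status */

/* Software model of a write-1-to-clear register: every 1 bit in `value`
 * clears the corresponding bit of the register, 0 bits have no effect. */
void rwc_write(uint32_t *reg, uint32_t value)
{
    *reg &= ~value;
}

/* Acknowledge only the bits in `mask`: write exactly those bits. */
void pxis_ack(uint32_t *pxis, uint32_t mask)
{
    rwc_write(pxis, mask);
}

/* What `PxIS |= mask` amounts to on such a register: the current value is
 * read, ORed with `mask`, and written back, clearing all pending bits. */
void pxis_or_assign(uint32_t *pxis, uint32_t mask)
{
    rwc_write(pxis, *pxis | mask);
}
```

On real hardware the equivalent of `pxis_ack` would simply be a plain assignment of the mask to the memory-mapped PxIS register.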