A cache coherence problem in AHCI device?

rdos
Member
Posts: 3308
Joined: Wed Oct 01, 2008 1:55 pm

Post by rdos »

OK, so the setup of the AHCI controller works now that I have removed the EHCI driver, which seemed to fiddle with the physical memory owned by the AHCI controller.

But I still have some kind of problem in the driver. I check that the command has completed by waiting for the command slot bit to be cleared, followed by an additional 1 ms wait, but when I read the memory contents returned by the IDENTIFY DEVICE command, I get the wrong data. If I set a breakpoint just before reading the contents and single-step in the kernel debugger, it works just fine and the sector contents are correct.

AHCI uses PCI bus-mastering, but what about the cache contents in the processor? Will the processor cache be updated once the AHCI controller has transferred the contents, or are other methods needed to ensure that the correct data is read?

It could also be that I don't check for completion in the correct way, and this is why the data is not correct.
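
For reference, a minimal sketch of the completion check described above: poll the slot's bit in the port's command-issue register, and bail out on a task-file error or after a bounded number of polls. The function name, the result codes and the `max_polls` parameter are hypothetical; the two pointers stand for the port's PxCI and PxIS registers mapped somewhere in MMIO space. This is an illustration under those assumptions, not the driver's actual code.

```c
#include <stdint.h>

#define PXIS_TFES (1u << 30)  /* PxIS bit 30: Task File Error Status */

enum wait_result { WAIT_DONE = 0, WAIT_ERROR = -1, WAIT_TIMEOUT = -2 };

/* Poll until the controller clears the slot's bit in PxCI.
   'volatile' forces a real register read on every iteration. */
static enum wait_result wait_slot_done(const volatile uint32_t *pxci,
                                       const volatile uint32_t *pxis,
                                       unsigned slot,
                                       unsigned long max_polls)
{
    for (unsigned long i = 0; i < max_polls; i++) {
        if (*pxis & PXIS_TFES)
            return WAIT_ERROR;            /* device reported an error */
        if ((*pxci & (1u << slot)) == 0)
            return WAIT_DONE;             /* controller is done with the slot */
    }
    return WAIT_TIMEOUT;
}
```

A cleared bit only says the controller has finished with the slot; a caller would still check the error case before trusting the buffer.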

What speaks against a cache problem is that the Realtek NIC-driver (especially the 8169 driver) also uses PCI bus-mastering and doesn't seem to have these issues.
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: A cache coherence problem in AHCI device?

Post by gerryg400 »

rdos wrote: Will the processor cache be updated once the AHCI controller has transferred the contents, or are other methods needed to ensure that the correct data is read?
I've always imagined that you would need to turn off caching on the pages used for DMA. Section 11.3.2, "Choosing a Memory Type", of the Intel manual (volume 3A) seems slightly vague on this. Perhaps some testing with various caching settings is in order.

http://kerneltrap.org/mailarchive/linux ... 29/1657814 says:
"Regular memory is cache coherent, and DMA (with a few very special cases as exception that are beyond the scope of this document) is cache coherent with the CPU on a PC. PCI MMIO regions and other similar pieces of device memory are NOT cache coherent."
That seems pretty clear. Are you using DMA, or is this MMIO?

Keep up the good work on AHCI.
If a trainstation is where trains stop, what is a workstation ?
rdos
Member
Posts: 3308
Joined: Wed Oct 01, 2008 1:55 pm

Re: A cache coherence problem in AHCI device?

Post by rdos »

I think I understand the reason. The size field in the PRD was set to 200h, and since the field is zero-based, that means 513 bytes instead of 512. As a result, the PRD-ready interrupt never fired. Now that I've corrected the size, the PRD bit in PxIS is set. I suppose this is the bit that should trigger "command ready".
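
To illustrate the off-by-one, here is a small sketch of encoding the PRD size field, assuming the layout given in the AHCI specification: a 22-bit, zero-based byte count in the last dword of the PRD entry, with the interrupt-on-completion flag in bit 31. The helper and macro names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PRD_DBC_MASK 0x003FFFFFu  /* bits 21:0: byte count minus one */
#define PRD_IOC      0x80000000u  /* bit 31: interrupt on completion */

/* Build the last dword of a PRD entry from a real byte count. */
static uint32_t prd_encode_dbc(uint32_t byte_count, int irq_on_completion)
{
    assert(byte_count >= 2 && byte_count <= 0x400000u); /* at most 4 MiB */
    assert((byte_count & 1u) == 0);                     /* must be even  */
    uint32_t dw = (byte_count - 1u) & PRD_DBC_MASK;     /* zero-based    */
    if (irq_on_completion)
        dw |= PRD_IOC;
    return dw;
}

/* Recover the real byte count from the field. */
static uint32_t prd_decode_dbc(uint32_t dw)
{
    return (dw & PRD_DBC_MASK) + 1u;
}
```

With this encoding a 512-byte sector is written as 1FFh, and 200h decodes to 513 bytes, which matches the bug above.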
rdos
Member
Posts: 3308
Joined: Wed Oct 01, 2008 1:55 pm

Re: A cache coherence problem in AHCI device?

Post by rdos »

The specification is not very clear about when read data is supposed to be valid, nor does it supply status flags in the tables that clearly mark things as "ready". What it provides is a PRD-ready interrupt mask and status flag, but this flag does not point to a specific command-list. It has to be interpreted as a port-wide variable, which makes it more or less unusable if more than one command-list is in use. The alternative is to watch "bytes transferred", which is in the command-list structure and thus is per-command rather than per-port. The problem with "bytes transferred" is that the command list has no "bytes to transfer" field, only a number of PRDs. To calculate "bytes to transfer" it is necessary to traverse the PRD table and add up the sizes there. This is not acceptable to do in an IRQ.

OTOH, there are 4 Dwords "reserved" in the command list, and one of them could be used by the OS as "bytes to transfer". However, the spec claims that reserved fields should be set to zero. I can understand the rationale for this in the MMIO of the HBA device, but not for the command lists laid out in ordinary RAM, which the OS constructs. The spec says nothing in particular about the use of reserved locations in command lists, so I think a fair guess is that the OS could use them, at least for the moment, as this would be far more efficient than putting the required extra fields elsewhere.

I also have my doubts about the safety of using the PxCI field to allocate a command-list. This variable is in the MMIO space of the AHCI controller, and thus is probably not multicore safe. Additionally, the clearing of a command-list bit in PxCI only indicates that the AHCI controller has finished processing an entry, not that the OS device driver is done with it. Therefore, I think the OS device driver needs to keep its own mask of active command-lists, which it updates in response to allocations and finished commands.
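
One way to keep the PRD traversal out of the IRQ without touching the reserved dwords is to sum the sizes once, when the command is built, and store the result in a driver-private array indexed by slot. The sketch below assumes the PRD entry layout from the AHCI specification (zero-based byte count in bits 21:0 of the last dword); the struct, the `expected_bytes` array and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One PRD entry: buffer address (low/high), reserved, byte count + flags. */
struct prd_entry { uint32_t dba, dbau, reserved, dbc_i; };

#define AHCI_SLOTS 32
static uint32_t expected_bytes[AHCI_SLOTS];  /* driver-private side table */

/* Walk the PRD table once and add up the (zero-based) byte counts. */
static uint32_t prd_total_bytes(const struct prd_entry *prd, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += (prd[i].dbc_i & 0x003FFFFFu) + 1u;
    return total;
}

/* Called when the command is set up, outside the IRQ path. */
static void slot_record_expected(unsigned slot,
                                 const struct prd_entry *prd, size_t count)
{
    expected_bytes[slot] = prd_total_bytes(prd, count);
}

/* Called from the IRQ: a single compare against "bytes transferred". */
static int slot_transfer_complete(unsigned slot, uint32_t bytes_transferred)
{
    return bytes_transferred == expected_bytes[slot];
}
```

The cost is one extra array in ordinary RAM, and the command list itself stays exactly as the spec describes it.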
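
A minimal sketch of such a driver-owned mask, using C11 atomics so that several cores can allocate slots without a lock. The names are hypothetical, and the sketch assumes all 32 slots exist; a real driver would limit itself to the number of slots the controller reports.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Bit n set = slot n is owned by the driver (allocated or in flight). */
static _Atomic uint32_t active_slots;

/* Returns a free slot index (0..31), or -1 if every slot is taken. */
static int slot_alloc(void)
{
    uint32_t old = atomic_load(&active_slots);
    for (;;) {
        if (old == UINT32_MAX)
            return -1;
        int slot = 0;
        while (old & (1u << slot))       /* find the lowest clear bit */
            slot++;
        if (atomic_compare_exchange_weak(&active_slots, &old,
                                         old | (1u << slot)))
            return slot;
        /* CAS failed: 'old' now holds the current mask, so retry. */
    }
}

/* Called once the driver itself has finished with the slot,
   not merely when the controller clears the bit in PxCI. */
static void slot_free(int slot)
{
    atomic_fetch_and(&active_slots, ~(1u << slot));
}
```

The slot is released only after the driver has consumed the result, so a cleared bit in PxCI never hands the slot to another core too early.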