Page 1 of 1

Problems with SATA

Posted: Wed Nov 17, 2021 10:18 pm
by jwhitehorn
So, in short, I'm having a lot of problems with SATA. I've read over the spec, and the samples in the wiki, and I'm not positive what I'm missing. Sadly, this caused me to shelf my OS project for over a year - it's not fun beating your head against a wall.

What I'm seeing is this:

I can probe for AHCI devices, and identify them by their signature. I can even identify the SATA controller. I intiallize it by attempting to rebase it, which appears successful. However, when I attempt to read from it, that's when things go sideway. This is also the part where I'm certain I'm missing something.

Here is a sample of the boot log when my OS is starting:

Code: Select all

probing PCI...
PCI: 0:0.0: 8086:1237: class: 6.0 (Bridge device) irq: 0
PCI: 0:1.0: 8086:7000: class: 6.1 (Bridge device) irq: 0
PCI: 0:1.1: 8086:7010: class: 1.1 (Storage controller) irq: 0
PCI: 0:1.3: 8086:7113: class: 6.80 (Bridge device) irq: 9
PCI: 0:2.0: 1234:1111: class: 3.0 (Display controller) irq: 0
PCI: 0:3.0: 8086:2922: class: 1.6 (Storage controller) irq: 11 bar5: febf1000
Intel ICH9 controller found (bus=0, slot=3, func=0, abar=0xfebf1000)
 HBA in legacy mode
SATA device detected:
   port[1].sig = ffffffff
   ipm=0, spd=0, det=3
   rebasing port...DONE
here A
lower ptr=0000000000400000
upper ptr=0000000000000000
A.1
ptr: 0000000000400000
lock '(null)':
 ffffffff80106f87 ffffffff80108858 ffffffff8010869e ffffffff80100da7 ffffffff80100775 ffffffff8010057f ffffffff8010a44e ffffffff8010a502 ffffffff8010a838 ffffffff8010abd5


PANIC on cpu 0
 acquire
STACK:
 [0] ffffffff801016eb
 [1] ffffffff80106f47
 [2] ffffffff80108ae5
 [3] ffffffff8010886a
 [4] ffffffff8010869e
 [5] ffffffff80100da7
 [6] ffffffff80100775
 [7] ffffffff8010057f
 [8] ffffffff8010a44e
 [9] ffffffff8010a502
HLT
The full source for my AHCI work is here: https://github.com/jwhitehorn/xv64/blob ... nel/ahci.c

But, the bits for reading from the device are not checked in (because they're all hacked up and failing). The logic looks like this:

Code: Select all

void ahci_sata_init(HBA_PORT *port, int num){
	if(ahci_rebase_port(port,num) > 0) {
		//TODO: rest of init
        uint16 buf[16];
        int success = ahci_sata_read(port, 0, 0, 1, &buf[0]);
        if(success == 1){
            cprintf("SUCCESS READING SATA!\n");
        }else{
            cprintf("BOO!!!!!!\n");
        }
	}
}

int ahci_sata_read(HBA_PORT *port, uint32 startl, uint32 starth, uint32 count, uint16 *buf) {

	port->is = (uint32) -1;               // Clear pending interrupt bits
	int spin = 0; // Spin lock timeout counter
	int slot = ahci_find_cmdslot(port);
	if (slot == -1)
		return 0;

    cprintf("here A\n");
    cprintf("lower ptr=%p\n", port->clb);
    cprintf("upper ptr=%p\n", port->clbu);
	HBA_CMD_HEADER *cmdheader = (HBA_CMD_HEADER*) (
          ((uint64)port->clbu << 32) + port->clb
      );
	cmdheader += slot;
    cprintf("A.1\n");
    cprintf("ptr: %p\n", cmdheader);
	cmdheader->cfl = sizeof(FIS_REG_H2D)/sizeof(uint32);  // Command FIS size
    cprintf("A.2\n");
	cmdheader->w = 0;               // Read from device
    cprintf("A.3\n");
	cmdheader->prdtl = (uint16)((count-1)>>4) + 1;        // PRDT entries count

    cprintf("here B\n");
	HBA_CMD_TBL *cmdtbl = (HBA_CMD_TBL*)((uint64)cmdheader->ctba);
	memset(cmdtbl, 0, sizeof(HBA_CMD_TBL) +
	       (cmdheader->prdtl-1)*sizeof(HBA_PRDT_ENTRY));

    cprintf("here C\n");
    int i = 0;
	// 8K bytes (16 sectors) per PRDT
	for (i=0; i<cmdheader->prdtl-1; i++){
		cmdtbl->prdt_entry[i].dba = (uint32) *buf;
		cmdtbl->prdt_entry[i].dbc = 8*1024-1;   // 8K bytes (this value should always be set to 1 less than the actual value)
		cmdtbl->prdt_entry[i].i = 1;
		buf += 4*1024;  // 4K words
		count -= 16;    // 16 sectors
	}
    cprintf("here D\n");
	// Last entry
	cmdtbl->prdt_entry[i].dba = (uint32) *buf;
	cmdtbl->prdt_entry[i].dbc = (count<<9)-1;       // 512 bytes per sector
	cmdtbl->prdt_entry[i].i = 1;

    cprintf("here E\n");
	// Setup command
	FIS_REG_H2D *cmdfis = (FIS_REG_H2D*)(&cmdtbl->cfis);

	cmdfis->fis_type = FIS_TYPE_REG_H2D;
	cmdfis->c = 1;  // Command
	cmdfis->command = ATA_CMD_READ_DMA_EX;

	cmdfis->lba0 = (uint8)startl;
	cmdfis->lba1 = (uint8)(startl>>8);
	cmdfis->lba2 = (uint8)(startl>>16);
	cmdfis->device = 1<<6;  // LBA mode

	cmdfis->lba3 = (uint8)(startl>>24);
	cmdfis->lba4 = (uint8)starth;
	cmdfis->lba5 = (uint8)(starth>>8);

	cmdfis->countl = count & 0xFF;
	cmdfis->counth = (count >> 8) & 0xFF;

    cprintf("here F\n");
	// The below loop waits until the port is no longer busy before issuing a new command
	while ((port->tfd & (ATA_DEV_BUSY | ATA_DEV_DRQ)) && spin < 1000000) {
		spin++;
	}
	if (spin == 1000000) {
		cprintf("Port is hung\n");
		return 0;
	}
    cprintf("here G\n");

	port->ci = 1<<slot;     // Issue command

	// Wait for completion
	while (1){
		// In some longer duration reads, it may be helpful to spin on the DPS bit
		// in the PxIS port field as well (1 << 5)
		if ((port->ci & (1<<slot)) == 0)
			break;
		if (port->is & HBA_PxIS_TFES){   // Task file error
			cprintf("Read disk error\n");
			return 0;
		}
	}
    cprintf("here H\n");

	// Check again
	if (port->is & HBA_PxIS_TFES){
		cprintf("Read disk error\n");
		return 0;
	}
    cprintf("here I\n");
	return 1;
}
I appreciate any/all help you can provide, as I'd like to get this project moving again.

Re: Problems with SATA

Posted: Thu Nov 18, 2021 3:37 am
by Klakap
port[1].sig = ffffffff
Signature of hard disk is 0x00000101. You are probably reading from non exiting device.

Re: Problems with SATA

Posted: Thu Nov 18, 2021 6:32 am
by jwhitehorn
Klakap wrote:
port[1].sig = ffffffff
Signature of hard disk is 0x00000101. You are probably reading from non exiting device.
I too thought the signature was odd, but I feel like I've been over that code so many times.

If I don't try to read from the disk, here is the full startup list:
Intel ICH9 controller found (bus=0, slot=3, func=0, abar=0xfebf1000)
HBA in legacy mode
SATA device detected:
port[1].sig = ffffffff
ipm=0, spd=0, det=3
rebasing port...DONE
SATA device detected:
port[16].sig = f0458b48
ipm=0, spd=8, det=3
rebasing port...FAILED
SATA device detected:
port[27].sig = d0014804
ipm=1, spd=8, det=1
rebasing port...FAILED
SATA device detected:
port[31].sig = e47589e8
ipm=5, spd=8, det=3
rebasing port...FAILED
As you can see, none of the other devices either have a signature of 0x101, nor do they succeed in rebasing.

My probing logic is here:

Code: Select all

void ahci_try_setup_known_device(char *dev_name, uint64 ahci_base_mem, uint16 bus, uint16 slot, uint16 func) {
	cprintf("%s controller found (bus=%d, slot=%d, func=%d, abar=0x%x)\n", dev_name, bus, slot, func, ahci_base_mem);

	HBA_MEM *ptr = (HBA_MEM *)&ahci_base_mem;
	cprintf(" HBA in ");
	if(ptr->ghc == 0x0) {
		cprintf("legacy mode\n");
	}else{
		cprintf("AHCI-only mode\n");
	}

	uint64 pi = ptr->pi;
	for(int i = 0; (i != 32); i++) {
		uint64 port_mask = 1 << i;
		if((pi & port_mask) == 0x0) {
			continue;
		}

		HBA_PORT *hba_port = (HBA_PORT *) &ptr->ports[i];

		if(hba_port->sig != SATA_SIG_ATAPI && hba_port->sig != SATA_SIG_SEMB && hba_port->sig != SATA_SIG_PM) {
			//we may have found a SATA device, but what is the status of this device?
			uint64 ssts = hba_port->ssts;

			uint8 ipm = (ssts >> 8) & 0x0F;
			uint8 spd = (ssts >> 4) & 0x0F;
			uint8 det = ssts & 0x7; //the Device Detection (DET) flags are the bottom 3 bits

			if (det != HBA_PORT_DET_PRESENT && ipm != HBA_PORT_IPM_ACTIVE) {
				//nope
			}else if(hba_port->sig==SATA_SIG_ATAPI) {
				//ATAPI device
			}else if(hba_port->sig==SATA_SIG_SEMB) {

			}else if(hba_port->sig==SATA_SIG_PM) {
				//port multiplier detected
			}else{
				cprintf("SATA device detected:\n");
				cprintf("   port[%d].sig = %x\n", i, hba_port->sig);
				cprintf("   ipm=%x, spd=%x, det=%x\n", ipm, spd, det);
				ahci_sata_init(hba_port, i);
			}
		}
	}
}
I'd appreciate any concrete tips on where you think that might be wrong, but again I've read the spec up and down and fail to see how the probing could be wrong. I hope I'm incorrect.

Re: Problems with SATA

Posted: Thu Nov 18, 2021 10:21 pm
by Ethin
Are you, perhaps, resetting the AHCI controller? I've found that if you tell the controller to reset you lose all the information that was in the registers (the value you saw is the default value when the controller comes out of reset). I'm not exactly sure myself what needs to be done to get that information back, so I usually just don't reset the controller.

Re: Problems with SATA

Posted: Thu Nov 18, 2021 10:31 pm
by Octocontrabass

Re: Problems with SATA

Posted: Thu Nov 18, 2021 11:03 pm
by jwhitehorn
Are you sure?

The address in this example (0xfebf1000) is stored in an integer. I want a pointer to the address 0xfebf1000, not a pointer to the integer holding 0xfebf1000 - so wouldn't I need the & operator there?

Also, if I remove it, it crashes the moment it tries to pull anything out of that pointer:
Intel ICH9 controller found (bus=0, slot=3, func=0, abar=0xfebf1000)
HBA in lock '(null)':
ffffffff80106fff ffffffff801088d0 ffffffff80108716 ffffffff8010057f ffffffff8010a4c6 ffffffff8010a57a ffffffff8010a8b0 ffffffff8010ac4d ffffffff8010ac69 ffffffff80104ece


PANIC on cpu 0
acquire
STACK:
[0] ffffffff80101763
[1] ffffffff80106fbf
[2] ffffffff80108b5d
[3] ffffffff801088e2
[4] ffffffff80108716
[5] ffffffff8010057f
[6] ffffffff8010a4c6
[7] ffffffff8010a57a
[8] ffffffff8010a8b0
[9] ffffffff8010ac4d
HLT

Re: Problems with SATA

Posted: Thu Nov 18, 2021 11:22 pm
by nullplan
jwhitehorn wrote: The address in this example (0xfebf1000) is stored in an integer. I want a pointer to the address 0xfebf1000, not a pointer to the integer holding 0xfebf1000 - so wouldn't I need the & operator there?
The logic is backward. By placing the address-of operator you are getting a pointer to that integer holding the base address, when you wanted a pointer to the base address.
jwhitehorn wrote:Also, if I remove it, it crashes the moment it tries to pull anything out of that pointer:
I don't know your memory space setup, but it looks as if you are using the ABAR directly as a pointer. The ABAR can only possibly contain a physical address, so unless you have paging disabled, you need to map it.

Re: Problems with SATA

Posted: Fri Nov 19, 2021 7:35 am
by jwhitehorn
nullplan wrote:The logic is backward. By placing the address-of operator you are getting a pointer to that integer holding the base address, when you wanted a pointer to the base address.
Thank you, both! You're right - I did have this backwards. I appreciate you pushing back. I'm not sure why I was turned around, maybe because it so obviously was broken otherwise.
nullplan wrote: I don't know your memory space setup, but it looks as if you are using the ABAR directly as a pointer. The ABAR can only possibly contain a physical address, so unless you have paging disabled, you need to map it.
I'm not 100% positive what you mean here. I'm using this as a learning excerise, so there's a LOT of concepts I'm still catching up on. Do you have reference (like a page in the wiki) that talks about this? Specifically I'm not certain if paging is enabled or disabled, as I've not yet messed around with any of the memory management logic in this OS - it's a fork of MIT's Xv6 that I'm using to help better understand all these concepts.

EDIT:
Nevermind, I found a page talking about paging in 64-bit mode, and I see PAE is enabled, so you're 100% right.

Now I just need to learn about mapping this so I can actually access it.

Re: Problems with SATA

Posted: Fri Nov 19, 2021 9:04 am
by jwhitehorn
Just a quick update;

Octocontrabass & nullplan were correct, the problem was a combination of an incorrect pointer and failing to correctly map the address of the ABAR.

This one commit fixes both of these: https://github.com/jwhitehorn/xv64/comm ... 0cdc100bce

As a result, the startup logs now look like this:

Code: Select all

Intel ICH9 controller found (bus=0, slot=3, func=0, abar=0xfebf1000)
 HBA in AHCI-only mode
SATA device detected:
   port[0].sig = 101
   ipm=1, spd=1, det=3
   rebasing port...DONE
HERElock '(null)':
 ffffffff80106e98 ffffffff80108769 ffffffff801085af ffffffff80100cb8 ffffffff80100782 ffffffff8010057f ffffffff8010a35f ffffffff8010a413 ffffffff8010a749 ffffffff8010aae6


PANIC on cpu 0
 acquire
STACK:
 [0] ffffffff801015fc
 [1] ffffffff80106e58
 [2] ffffffff801089f6
 [3] ffffffff8010877b
 [4] ffffffff801085af
 [5] ffffffff80100cb8
 [6] ffffffff80100782
 [7] ffffffff8010057f
 [8] ffffffff8010a35f
 [9] ffffffff8010a413
HLT


The device signature is 0x101 (just like Klakap said it should be). It's still failing when it tries to read from the device, but I'm much more confident that I'm actually on the right path. Also, from debugging I know the error lies here:

Code: Select all

	HBA_CMD_HEADER *cmdheader = (HBA_CMD_HEADER*) (
	          ((uint64)port->clbu << 32) + port->clb
	      );
	cmdheader += slot;
	cmdheader->cfl = sizeof(FIS_REG_H2D)/sizeof(uint32);	// Command FIS size
I need to do more research, but I suspect this is similar to be problem I was having with the ABAR - simply, the address is either goofed up because I'm in 64-bit land, or because of (a lack of) memory mapping.

Either way, thanks everyone for your help!!

Re: Problems with SATA

Posted: Fri Nov 19, 2021 10:04 am
by jwhitehorn
jwhitehorn wrote: I need to do more research, but I suspect this is similar to be problem I was having with the ABAR - simply, the address is either goofed up because I'm in 64-bit land, or because of (a lack of) memory mapping.
Another update;

That's exactly what it was, both actually.

The offending line needed mapping (and a parenthesis, and also the same when getting the pointer to HBA_CMD_TBL.

Code: Select all

	HBA_CMD_HEADER *cmdheader = (HBA_CMD_HEADER*) P2V(
	          ( ((uint64)port->clbu) << 32) + port->clb
	      );
I'm now successfully reading from SATA!!! Thanks again, everyone.