Page 1 of 1

I've hit my wits end on an ATA problem

Posted: Mon Mar 21, 2011 7:26 pm
by thomasluce
Every year or so I venture into building an OS, and every year I make it a bit further than the year before. It is both a wonderful learning experience, and an exercise in frustration. But I do enjoy it!

This year I've made it all the way to accessing harddrives(!) and I've run into a problem that I've spent the last month trying to debug, but I'm in it to win it at this point. I've debugged the hell out of it, tried to fix it every way I know how, and even just re-written it from scratch a few times. The symptom at this point is that when I try to read from a drive I get a drive fault. The trick of it is that it's virtual hardware... IDENTIFY (and all the fiddling with that) works fine, so I know that things work in general. I'm stuck. Like I said, I've been pushing this rock for a month, and I've come to it: asking for help.

Please, oh people who have done this before me. Help. :mrgreen:

Here's the code where the problem is (as best I can find):

Code: Select all

static u32int ata_read_lba28(drive_info_t drive, u32int start_sector, u8int sector_count, void *buffer) {
        // Do some bit-twiddling, and set up the address.
        u8int magic = 0xE0 | drive.drive;
        outb(drive.controller + ATA_DRIVE_PORT, magic | ((start_sector >> 24) & 0x0F));
        outb(drive.controller + ATA_ERROR_PORT, 0);

        // Send the sector count
        outb(drive.controller + ATA_SECTOR_COUNT_PORT, sector_count);

        // And the address, (lo, mid, high)
        outb(drive.controller + ATA_SECTOR_NUM_PORT, (u8int)((start_sector >>  0) & 0xFF));
        outb(drive.controller + ATA_LOW_PORT,        (u8int)((start_sector >>  8) & 0xFF));
        outb(drive.controller + ATA_HIGH_PORT,       (u8int)((start_sector >> 16) & 0xFF));

        // Send the read command
        outb(drive.controller + ATA_COMMAND_PORT, ATA_CMD_READ_PIO);

        u8int sector = 0;
        u32int total = 0;
        for(sector = 0; sector < sector_count; sector++) {
                // Wait till we are ready
                u8int result = ata_poll(drive);
                if(result == 2) {
                        ata_print_error(drive);
                        return total;
                } else if(result == 1) {
                        monitor_write("Drive failure.\n");
                        return total;
                } else if(result == 3) {
                        monitor_write("DRQ not set...\n");
                        return total;
                }

                // Get 256 bytes (one sector's worth) of data.
                u32int i = 0;
                for(i = 0; i < 128; i++) {
                        u16int temp = inw(drive.controller + ATA_DATA_PORT);
                        monitor_write_hex(temp); monitor_put(' ');
                        ((u16int*)buffer)[total] = temp;
                        total++;
                }
        }
        return total;
}
It dies in the "else if(result == 1)" bit. The poll function looks like this:

Code: Select all

static u8int ata_poll(drive_info_t drive) { 
        u16int port = drive.controller + ATA_COMMAND_PORT;
        inb(port);
        inb(port);
        inb(port);
        inb(port);
    
        while((inb(port) & ATA_STATUS_BSY));
    
        // Get the status register
        u8int status = inb(drive.controller + ATA_COMMAND_PORT);
        if(status & ATA_STATUS_ERR) return 2;           // Describable error
        if(status & ATA_STATUS_DF) return 1;            // Device fault
        if(!(status & ATA_STATUS_DRQ)) return 3;        // DRQ should be set
        return 0;
}
All of it is run after disabling interrupts on the controller. Any help. Any insight. Any ideas. Just hearing another person have a thought about it might help. I need a fresh perspective because I feel like I've got tunnel vision on the issue at this point. I'm using qemu 0.12.5, on ubuntu 10.10. I'm trying to read a single sector from the device starting at sector 0 (though I've tried it with other values as well).

Insights, flames, and haikus welcome.

Thanks!

Re: I've hit my wits end on an ATA problem

Posted: Tue Mar 22, 2011 8:56 am
by thomasluce
Good point. Thanks. It doesn't explain the device failure, though. I get that on the read of the first byte.

Re: I've hit my wits end on an ATA problem

Posted: Wed Mar 23, 2011 4:00 pm
by thomasluce
So here's a weird thing: When I don't IDENTIFY the drive first, I can read data just fine. However, I do get valid IDENTIFY information back when I do. Here is my code for that:

Code: Select all

static void identify_drive(u16int controller, u8int drive, drive_info_t *drive_info) {
        // Select the drive
        outb(controller + ATA_DRIVE_PORT, drive);

        // Set the LBAlo, LBAmid, LBAhi ports to zero
        outb(controller + ATA_SECTOR_NUM_PORT, 0);
        outb(controller + ATA_LOW_PORT, 0);
        outb(controller + ATA_HIGH_PORT, 0);

        // Ask for some info
        outb(controller + ATA_COMMAND_PORT, ATA_CMD_IDENTIFY);

        // Read back a status
        u8int result = ata_poll(*drive_info);

        // Check that it is actually an ATA drive (and not some other ATAPI
        // device that I don't care about at the moment)
        if(inb(controller + ATA_SECTOR_NUM_PORT) != 0 || inb(controller + ATA_LOW_PORT) != 0) return;

        // We are a real, honest to jebus, ATA drive. Wait for ATA_STATUS_DRQ or ATA_STATUS_ERR.
        while(1) {
                result = inb(controller + ATA_COMMAND_PORT);
                if(!(result & ATA_STATUS_BSY) && (result & ATA_STATUS_DRQ)) break;
                if((result & ATA_STATUS_ERR)) {
                        ata_print_error(*drive_info);
                        return;
                }
        }

        // If we are here, then we are ready to read some data!
        u16int i;
        for(i = 0; i < 256; i++) {
                drive_info->identity[i] = inw(controller + ATA_DATA_PORT);
        }

        if(drive_info->identity[ATA_IDENT_MAX_LBA_EXT]) {
                // Note that this OS is a 32-Bit Operating System, So last 2 Words are ignored.
                drive_info->address_mode = LBA48;
                drive_info->size         = drive_info->identity[ATA_IDENT_MAX_LBA_EXT];
        } else if(drive_info->identity[ATA_IDENT_MAX_LBA]) {
                // Note that this OS is a 32-Bit Operating System, So last 2 Words are ignored.
                drive_info->address_mode = LBA28;
                drive_info->size         = drive_info->identity[ATA_IDENT_MAX_LBA];
        } else {
                drive_info->address_mode = CHS;
                u16int cylinders = drive_info->identity[1];
                u16int heads     = drive_info->identity[3];
                u16int sectors   = drive_info->identity[6];
                drive_info->size = cylinders * heads * sectors;
        }
}
Hrm... Well, that's encouraging in a way. Has anyone ran across a similar issue, or have any insight from a different angle?

Thanks again!
-Thomas

Re: I've hit my wits end on an ATA problem

Posted: Wed Mar 23, 2011 6:57 pm
by bewing
To do a proper IDENTIFY, you should definitely also set ATA_SECTOR_COUNT_PORT to 0 -- and maybe ATA_ERROR_PORT too, just for fun.

And it doesn't look like you are sending the proper "magic bits" to ATA_DRIVE_PORT. It should be 0xA0 for the master, or 0xB0 for the slave -- probably not just "drive".

And you should probably also test or print the actual value of "result" that you get back from ata_poll().

Re: I've hit my wits end on an ATA problem

Posted: Thu Mar 24, 2011 3:18 pm
by thomasluce
Hey, thanks bewing. I set those things to zero (from what I've read nothing really pays attention to that, but you're right, I should do it for the sake of correctness). Drive is either 0xA0 or 0xB0 (sorry I didn't post the header file), but good eyes none-the-less.

As for the polling, I realized that the requirements for completeness are actually a bit different, so I just took out that call and in-lined the functionality. It got things "more correct", but now I am in a situation where not IDENTIFY'ing the drive doesn't solve my original problem. I guess that's good news; I'm doing the same wrong thing everywhere, at least.

Thanks again!
-Thomas