Reading from SATA/ATA/... disks (and maybe solving my PCI is

Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Post by Ethin »

So, some of you guys know that I was trying to get Intel HDA going. I have decided to put that to the side for now because I at least have output via the serial port and my OS desperately needs some other things before audio.
One thing my OS is pretty much useless without is IO. I have a lot of ideas for what file systems to implement -- but I can't do that without knowing how to read from and write to disks. Sadly, the OSDev wiki is sorely lacking in this department. I see projects like Coreboot have commands like FRE, ST, CLO, etc., but have absolutely no idea where this code comes from.
So I'm trying not only to understand how disk IO works (because why write a disk IO driver if I don't understand how it works?) but also to implement a driver. The problem is, I can't find any kind of guide/tutorial/whatnot on how to do that, and the specs aren't really helping.
The main resource I've been clinging onto is code. But that has its own problems: most of the hobby OS projects that are easy to read/well documented have code that can only read from disks, or the disk implementation was never completed. I haven't looked through Minix's code yet, though I really don't want a microkernel design.
Those who know of my HDA attempt also know of my PCI issue. I'd also like to attempt to solve that while I'm at it. I'm fine with hardcoding memory addresses in code for now, but I really want that issue solved so it's no longer something I need to worry about until I go adding PCIe.
So, my questions are:
1) What would be a good resource for learning how to read from and write to AHCI/ATA disks?
2) What would be a good OS for me to use as a sort of reference guide? (Please, as little ASM as possible.)
3) Could we try and solve the PCI problem I'm having (getting right device IDs and vendor IDs and such, but not getting any other data other than basic data) during this discussion?
If 3 isn't possible, that's fine, I can post it in another thread. I'll post my PCI handling code if anyone needs a refresher or hasn't seen it yet (I need to translate it into C first).
zity
Member
Member
Posts: 99
Joined: Mon Jul 13, 2009 5:52 am
Location: Denmark

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by zity »

Ethin wrote:.... Sadly, the OSDev wiki is sorely lacking in this department.
I strongly disagree with this statement. I wrote fully working ATA/AHCI drivers using (almost exclusively) the wiki articles. Of course I had to look up a few details in the specs, but not very much.

ATA PIO Mode
ATA/ATAPI_using_DMA
AHCI
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Octocontrabass »

Ethin wrote:Sadly, the OSDev wiki is sorely lacking in this department.
Any suggestions for improvement on these pages?
Ethin wrote:The problem is, I can't find any kind of guide/tutorial/whatnot on how to do that, and the specs aren't really helping.
Welcome to OS development. :lol:
Ethin wrote:Could we try and solve the PCI problem I'm having (getting right device IDs and vendor IDs and such, but not getting any other data other than basic data) during this discussion?
Personally, I'd start here and worry about storage later. PCI is fundamental to modern PCs, so getting it right is important if you want to run your OS on a variety of hardware.
Ethin wrote:I'll post my PCI handling code if anyone needs a refresher or hasn't seen it yet (I need to translate it into C first).
Got an online repository we could look at?
iansjack
Member
Member
Posts: 4705
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by iansjack »

This is one of the better documented aspects of OS development, so you really need to brush up on your search skills.
I haven't looked through Minix's code yet, though I really don't want a microkernel design.
The fact that Minix is a microkernel design is irrelevant. Learning how to control the interface doesn't depend upon the type of kernel design (although your implementation of that information, obviously, will).
LtG
Member
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by LtG »

And I think the ATA specs aren't actually bad at all.

Also, it might be a good idea to start from the oldest spec and implement that and then skim thru the middle specs and then implement the latest.

For instance, the 386 manual is shorter and more easily understood, and still 100% valid. The newer ones in addition have to deal with all things 64-bit as well as a ton of optimizations and new instructions (which are largely optimizations).

With HDD you might want to start with PIO (although it sucks), and then move to DMA, but of course with DMA you also need to know how the DMA works, as it's separate from ATA.
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Octocontrabass »

LtG wrote:For instance, the 386 manual is shorter and more easily understood, and still 100% valid.
I wouldn't say it's 100% valid. Maybe 99%?

I agree that it's a good idea to look at older specifications to get a better understanding, especially for ATA, but make sure you look out for cases where the newer specs are no longer backwards-compatible.

If you're looking for the oldest possible ATA specification, look at the manual for the IBM PC AT Fixed Disk and Diskette Drive Adapter (available here).
LtG
Member
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by LtG »

Octocontrabass wrote: I wouldn't say it's 100% valid. Maybe 99%?
Good point; luckily I've never been bitten by that. Never tried to get away with just a JMP.
A JMP instruction should immediately follow the setting of PG.
I'm not sure what Intel is implying here, because they're only talking about strategies and not explicitly stating anything.

For instance, a MOV to SS inhibits interrupts for the next instruction, which is normally the MOV to ESP, to ensure stack coherency.

The way I see it, either the TLB is empty when PG is enabled, or it's been collecting (so far) useless translations based on the paging structures, but the first instruction after enabling paging would be located based on paging. I'm guessing the implication of Intel's note above is that on the 386 the next instruction would already have been picked up prior to enabling PG, but would be evaluated in the context of paging being enabled, right? So it's picked up from the old physical address, though its JMP destination is decided based on paging.

The link doesn't tell us what's in ECX at the JMP.

I think the TLB side of the article is a red herring; it's entirely to do with how the next instruction after PG is enabled (the JMP) is fetched: based on the old physical addressing, or the new paging-based addressing.

I also don't think Intel was sloppy by not specifying PG 0 -> 1, because specifying it makes people rely on it, and there's no good reason to let people rely on it.

PS. While looking at the original manual I also noticed it had a bit about OS testing the TLB, probably something Brendan would need to do.. I'll need to check the newer manuals at some point to see if they still have that ability, and maybe implement it myself. Though I'm not aware of any cases where the CPU functions at all, yet the TLB malfunctions, so maybe not worth the effort.
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Ethin »

The page I was particularly referring to was the SATA page, though ATA is also a good place to start -- I should've clarified. Apologies.
The page notes that the T13 and SATA IO specs have discrepancies and that, at the time of writing, the industry hadn't shaken out enough to determine which "specification" was "definitive". Checking the revision history of that page, I see that the last revision was from 2014; can I safely assume that this has changed?
My code is written in Rust; the repository is at https://github.com/ethindp/kernel. Once I figure out PCI I'll refer back to OSDev for implementing AHCI/ATAPI. Perhaps then things will make a lot more sense. :)
For reference, the PCI code is in src/pci.rs. I haven't commented it too heavily, though I need to start doing that for my own sake. :P
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Octocontrabass »

Ethin wrote:For reference, the PCI code is in src/pci.rs.
I see the problem. You have a read_word function to read from the PCI configuration space, but you call it with different expectations in different places.

First, let's pick a reasonable set of expectations: read_word will always return the u32 located at a 4-byte-aligned offset in the PCI configuration space, and it's up to the caller to filter out unwanted bits.

On line 145, the bus, slot, func, and offset parameters never use more than 8 bits, so you can define these as u8 instead of u16.

On line 151, does Rust do implicit type promotion? If not, you're doing a shift by 16 bits on a 16-bit integer, which always results in 0. You may need to do the type cast first, then perform the shift. (This will also apply to lines 152 and 153 if you redefine the parameters to u8.)

On line 158, delete all of the shifting and masking - just return inl(0xCFC).

On line 165, the vendor ID is the low 16 bits of offset 0, so use get_bits(0..16) to extract the relevant portion.

On line 169, the device ID is the high 16 bits of offset 0, so change the offset to 0 and use get_bits(16..32) to extract the relevant portion.

With that much, you should at least start seeing some reasonable vendor and device IDs - you'll need to fix your other incorrect uses of read_word and get_bits before you'll get a complete and correct list of PCI devices.
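
A minimal sketch of those expectations, assuming the legacy port-based configuration mechanism (write the address to port 0xCF8, then read the dword from 0xCFC). The function names here are hypothetical and not taken from src/pci.rs, and the port I/O itself is left out so the bit arithmetic can be checked on its own:

```rust
/// Builds the 32-bit value to write to CONFIG_ADDRESS (port 0xCF8).
/// Every field is widened to u32 *before* it is shifted, so no bits are lost.
pub fn config_address(bus: u8, slot: u8, func: u8, offset: u8) -> u32 {
    0x8000_0000                          // bit 31: enable bit
        | (u32::from(bus) << 16)         // bits 23..16: bus number
        | (u32::from(slot & 0x1F) << 11) // bits 15..11: device (slot) number
        | (u32::from(func & 0x07) << 8)  // bits 10..8: function number
        | u32::from(offset & 0xFC)       // bits 7..2: 4-byte-aligned register offset
}

/// Vendor ID: the low 16 bits of the dword at offset 0.
pub fn vendor_id(reg0: u32) -> u16 {
    (reg0 & 0xFFFF) as u16
}

/// Device ID: the high 16 bits of the dword at offset 0.
pub fn device_id(reg0: u32) -> u16 {
    (reg0 >> 16) as u16
}
```

With this split, read_word would write config_address(...) to 0xCF8 and return inl(0xCFC) unmodified, and each caller picks out the bits it wants.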
iansjack
Member
Member
Posts: 4705
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by iansjack »

It's certainly refreshing to see someone using a language other than C for OS development. But you have to understand that this is making life more difficult for yourself as most documentation and example code will use C. To my mind it is probably more productive to use C first and then try a less usual language once you are comfortable with the basics of OS development.

If you choose Rust (which sounds like a good choice, that I am beginning to play with) be sure that you understand the language thoroughly, and that you have an equally deep understanding of C. This will allow you to read examples in C and then implement the knowledge gained in Rust. For example (something that I don't know yet), be sure that you understand how Rust lays out the equivalents of C structs in memory. That sort of thing can cause a lot of problems if you are making incorrect assumptions.
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Ethin »

Thank you, Octocontrabass. I think my issue with PCI is the way the tables are presented. Since I use a screen reader, I use the control+alt+arrow keys to navigate tables (though there are more keys than just those). Let's take the PCI article, header 01h, register 04, offset 10 (BAR 0) as an example. My screen reader presents this to me like this:

register | offset | bits 31-24 | bits 23-16 | bits 15-8 | bits 7-0
...
04 | 10 | Base address #0 (BAR0) | | |
That is, BAR 0 is presented to me as being bits 24 .. 31, but it could be bits 0 .. 31. I have no way of knowing, so I need to guess and try to infer and just hope my inferences are correct, or ask people on here if they are.
I can't convert the page to markdown, or JSON (JSON would certainly be nice!) for the tables, which explains why I'm having all of these PCI difficulties -- I don't know which bits always correspond to which item. Some of them are easy for me to figure out, like the min grant, which, I believe, is bits 16 .. 23 of offset 3C; but others, like the BARs, are not so easy.
I hope this helps people understand my issue more clearly.
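
For reference, the merged table cells can be flattened into explicit (offset, low bit, width) entries, which removes the guessing: a cell spanning all four columns, like BAR0, covers bits 0 .. 31 of its dword. The listing below is a sketch of part of the standard type 0 header. The Field type, the FIELDS table and the helper names are hypothetical, and BAR1 to BAR5 follow BAR0 at offsets 0x14 to 0x24:

```rust
/// One named field of the type 0 header: which dword it lives in and which bits it occupies.
pub struct Field {
    pub name: &'static str,
    pub offset: u8,  // 4-byte-aligned offset of the containing dword
    pub low_bit: u8, // lowest bit of the field (inclusive)
    pub width: u8,   // number of bits in the field
}

/// Partial layout of the type 0 configuration header.
pub static FIELDS: &[Field] = &[
    Field { name: "vendor ID", offset: 0x00, low_bit: 0, width: 16 },
    Field { name: "device ID", offset: 0x00, low_bit: 16, width: 16 },
    Field { name: "command", offset: 0x04, low_bit: 0, width: 16 },
    Field { name: "status", offset: 0x04, low_bit: 16, width: 16 },
    Field { name: "revision ID", offset: 0x08, low_bit: 0, width: 8 },
    Field { name: "prog IF", offset: 0x08, low_bit: 8, width: 8 },
    Field { name: "subclass", offset: 0x08, low_bit: 16, width: 8 },
    Field { name: "class", offset: 0x08, low_bit: 24, width: 8 },
    Field { name: "header type", offset: 0x0C, low_bit: 16, width: 8 },
    Field { name: "BAR0", offset: 0x10, low_bit: 0, width: 32 },
    Field { name: "interrupt line", offset: 0x3C, low_bit: 0, width: 8 },
    Field { name: "interrupt pin", offset: 0x3C, low_bit: 8, width: 8 },
    Field { name: "min grant", offset: 0x3C, low_bit: 16, width: 8 },
    Field { name: "max latency", offset: 0x3C, low_bit: 24, width: 8 },
];

/// Looks a field up by name.
pub fn field(name: &str) -> Option<&'static Field> {
    FIELDS.iter().find(|f| f.name == name)
}

/// Extracts a field from the dword read at `f.offset`.
pub fn extract(dword: u32, f: &Field) -> u32 {
    // A full-width field needs a special case: `1u32 << 32` would overflow.
    let mask = if f.width >= 32 { u32::MAX } else { (1u32 << f.width) - 1 };
    (dword >> f.low_bit) & mask
}
```

So the min grant guess above (bits 16 .. 23 of offset 3C) matches this layout, and each BAR is a whole dword.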
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Ethin »

OK, so I've added some debugging code in my PCI probe. I apparently can now get BARs (I think they're the BARs I'm looking for, even though I get strange results...). As an example, here's what my kernel reports when probing PCI devices:
PCI: probe: found Intel Corporation 440FX - 82441FX PMC [Natoma] (Host bridge)
PCI: probe: codes: vendor = 8086h, device = 1237h, class = 6h, subclass = 0h, prog if=0h, rev=2h, status=0h, command=1h, bus = 0h, slot = 0h, function = 0h
PCI: probe: BARs (header == 0h): 0: 0h, 1: 0h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] (ISA bridge)
PCI: probe: codes: vendor = 8086h, device = 7000h, class = 6h, subclass = 1h, prog if=0h, rev=0h, status=2h, command=1h, bus = 0h, slot = 1h, function = 0h
PCI: probe: BARs (header == 0h): 0: 0h, 1: 0h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] (IDE interface)
PCI: probe: codes: vendor = 8086h, device = 7010h, class = 1h, subclass = 1h, prog if=0h, rev=0h, status=2h, command=1h, bus = 0h, slot = 1h, function = 1h
PCI: probe: BARs (header == 0h): 0: 0h, 1: 0h, 2: 0h, 3: 0h, 4: C641h, 5: 0h
PCI: probe: found Intel Corporation 82371AB/EB/MB PIIX4 ACPI (Host bridge)
PCI: probe: codes: vendor = 8086h, device = 7113h, class = 6h, subclass = 0h, prog if=0h, rev=3h, status=2h, command=1h, bus = 0h, slot = 1h, function = 3h
PCI: probe: BARs (header == 0h): 0: 0h, 1: 0h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Unknown vendor mPCIe-ICM485-2 2x Isolated RS485 PCI Express Mini Card (VGA compatible controller)
PCI: probe: codes: vendor = 1234h, device = 1111h, class = 3h, subclass = 0h, prog if=0h, rev=2h, status=0h, command=1h, bus = 0h, slot = 2h, function = 0h
PCI: probe: BARs (header == 0h): 0: FD000008h, 1: 0h, 2: FEBF4000h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Intel Corporation 82540EM Gigabit Ethernet Controller (Ethernet controller)
PCI: probe: codes: vendor = 8086h, device = 100Eh, class = 2h, subclass = 0h, prog if=0h, rev=3h, status=0h, command=1h, bus = 0h, slot = 3h, function = 0h
PCI: probe: BARs (header == 0h): 0: FEBC0000h, 1: C601h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (Audio device)
PCI: probe: codes: vendor = 8086h, device = 2668h, class = 4h, subclass = 3h, prog if=0h, rev=1h, status=0h, command=1h, bus = 0h, slot = 4h, function = 0h
PCI: probe: BARs (header == 0h): 0: FEBF0000h, 1: 0h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Intel Corporation 82801AA AC'97 Audio Controller (Multimedia audio controller)
PCI: probe: codes: vendor = 8086h, device = 2415h, class = 4h, subclass = 1h, prog if=0h, rev=1h, status=2h, command=1h, bus = 0h, slot = 5h, function = 0h
PCI: probe: BARs (header == 0h): 0: C001h, 1: C401h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
PCI: probe: found Ensoniq NV5000SC (Multimedia audio controller)
PCI: probe: codes: vendor = 1274h, device = 5000h, class = 4h, subclass = 1h, prog if=0h, rev=0h, status=4h, command=1h, bus = 0h, slot = 6h, function = 0h
PCI: probe: BARs (header == 0h): 0: C501h, 1: 0h, 2: 0h, 3: 0h, 4: 0h, 5: 0h
I finally got the right BAR for the HDA device, at least. (I'm doing no bit masking whatsoever, even though the PCI article says I need to do so.) That is the confusing part, though. It says:
Base address Registers (or BARs) can be used to hold memory addresses used by the device, or offsets for port addresses. Typically, memory address BARs need to be located in physical ram while I/O space BARs can reside at any memory address (even beyond physical memory). To distinguish between them, you can check the value of the lowest bit.
Does this mean I need to check bit zero, another bit, or a set of bits? It then goes on to say:
When you want to retrieve the actual base address of a BAR, be sure to mask the lower bits. For 16-Bit Memory Space BARs, you calculate (BAR[x] & 0xFFF0). For 32-Bit Memory Space BARs, you calculate (BAR[x] & 0xFFFFFFF0). For 64-Bit Memory Space BARs, you calculate ((BAR[x] & 0xFFFFFFF0) + ((BAR[x+1] & 0xFFFFFFFF) << 32)) For I/O Space BARs, you calculate (BAR[x] & 0xFFFFFFFC).
Besides the minor typo there, does this mean what I think it means? Let's say I'm looking at BAR 0 of the gigabit controller. That BAR is FEBC0000. It also has another BAR, C601. Seems like a 64-bit address to me. To get the actual address, would I do:

((0xFEBC0000 & 0xFFFFFFF0) + ((0xC601 & 0xFFFFFFFF) << 32)) => 0xC601FEBC0000
If so, then I've finally fixed the major problem I'm having, and can now start actually working with PCI at last.
Edit: OK, a few more questions too.
1) What do I do about BARs that have holes? I.e.: the mPCIe-ICM485-2 2x Isolated RS485 PCI Express Mini Card has two bars: BAR zero, which is FD000008h, and BAR 2, FEBF4000h.
2) Why am I seeing so many devices with all six bars being all zeros?
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Octocontrabass »

Ethin wrote:I think they're the BARs I'm looking for, even though I get strange results...
Some of the strange results are due to being off-by-one in your bit ranges. In Rust, the last number in a range is not included unless it's prefixed with an equals sign, so when you write get_bits(0..7) you're only getting bits 0 through 6. You need to instead write either get_bits(0..8) or get_bits(0..=7) for all 8 bits.
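
To make the range behaviour concrete, here is a small stand-in for a get_bits-style helper. It is a hypothetical free function written for this example, not the one used in the repository, and it assumes the range starts below bit 32. In a no_std kernel the same traits come from core::ops instead of std::ops:

```rust
use std::ops::{Bound, RangeBounds};

/// Returns the bits of `value` selected by `range`, shifted down to bit 0.
/// Accepts both exclusive (`a..b`) and inclusive (`a..=b`) ranges.
pub fn get_bits<R: RangeBounds<u32>>(value: u32, range: R) -> u32 {
    // First bit included in the result.
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s + 1,
        Bound::Unbounded => 0,
    };
    // One past the last bit included in the result.
    let end = match range.end_bound() {
        Bound::Included(&e) => e + 1,
        Bound::Excluded(&e) => e,
        Bound::Unbounded => 32,
    };
    let width = end - start;
    // `1u32 << 32` would overflow, so a full-width request gets an all-ones mask.
    let mask = if width >= 32 { u32::MAX } else { (1u32 << width) - 1 };
    (value >> start) & mask
}
```

With a value of 0xFF, the range 0..7 yields 0x7F (only seven bits), while 0..8 and 0..=7 both yield 0xFF.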
Ethin wrote:I finally got the right BAR for the HDA device, at least. (I'm doing no bit masking whatsoever, even though the PCI article says I need to do so.)
That's just a coincidence. All of the bits you need to mask happen to already be zero in this case.
Ethin wrote:Does this mean I need to check bit zero, another bit, or a set of bits?
To distinguish between memory and I/O BARs, check bit zero. It's always zero for memory BARs, and always one for I/O BARs.

To distinguish between different types of memory BARs, check bits 1 and 2. The value 0 means it's a 32-bit address. The value 2 means it's a 64-bit address, with the upper 32 bits located in the next BAR. The value 1 is obsolete, but used to mean it's a 16-bit address.
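
Those two checks could be sketched as follows. The enum and function names are made up for the example:

```rust
/// What kind of address a raw BAR value describes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BarKind {
    Io,           // bit 0 is 1
    Memory32,     // bit 0 is 0, bits 2..1 hold 0
    MemoryLegacy, // bit 0 is 0, bits 2..1 hold 1 (obsolete encoding)
    Memory64,     // bit 0 is 0, bits 2..1 hold 2; upper half is in the next BAR
    Reserved,     // bit 0 is 0, bits 2..1 hold 3
}

/// Classifies a raw BAR value by looking at bit 0, then bits 1 and 2.
pub fn bar_kind(raw: u32) -> BarKind {
    if raw & 1 == 1 {
        return BarKind::Io;
    }
    match (raw >> 1) & 0b11 {
        0 => BarKind::Memory32,
        1 => BarKind::MemoryLegacy,
        2 => BarKind::Memory64,
        _ => BarKind::Reserved,
    }
}
```

Applied to the probe output earlier in the thread, FEBC0000h and FD000008h both classify as 32-bit memory BARs, and C601h classifies as an I/O BAR.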
Ethin wrote:Let's say I'm looking at BAR 0 of the gigabit controller. That BAR is FEBC0000. It also has another BAR, C601. Seems like a 64-bit address to me.
It's not. Bit 0 of the first BAR is 0, so it's a memory address. Bits 1 and 2 are also 0, so it's a 32-bit memory address. Since it's a 32-bit memory address, you should parse the second BAR like normal instead of treating it like upper address bits. (Incidentally, bit 0 of the second BAR is 1, which means it's an I/O address.)

Your math is correct for a 64-bit BAR, if you find any.
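
Putting the masking and the 64-bit pairing together, a decoder for all six BARs might look like the sketch below. The Bar enum and decode_bars are hypothetical names, and treating a register that reads as zero as unused is a simplifying assumption of this example:

```rust
/// A decoded base address register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bar {
    Unused,      // register reads as zero (assumed to be unimplemented)
    Io(u32),     // I/O port base, low two bits masked off
    Memory(u64), // memory base (32- or 64-bit), low four bits masked off
    UpperHalf,   // consumed as the high dword of the preceding 64-bit BAR
}

/// Decodes the six raw BAR dwords of a type 0 header.
pub fn decode_bars(raw: [u32; 6]) -> [Bar; 6] {
    let mut out = [Bar::Unused; 6];
    let mut i = 0;
    while i < 6 {
        let lo = raw[i];
        if lo & 1 == 1 {
            // I/O space BAR.
            out[i] = Bar::Io(lo & 0xFFFF_FFFC);
        } else if (lo >> 1) & 0b11 == 2 && i + 1 < 6 {
            // 64-bit memory BAR: the next register holds address bits 63..32.
            let hi = raw[i + 1];
            out[i] = Bar::Memory(u64::from(lo & 0xFFFF_FFF0) | (u64::from(hi) << 32));
            out[i + 1] = Bar::UpperHalf;
            i += 1; // skip the register that was just consumed
        } else if lo != 0 {
            // 32-bit memory BAR.
            out[i] = Bar::Memory(u64::from(lo & 0xFFFF_FFF0));
        }
        i += 1;
    }
    out
}
```

For the gigabit controller above this gives a 32-bit memory base of FEBC0000h in BAR 0 and an I/O base of C600h in BAR 1. Only a BAR 0 whose type bits hold 2 would combine with BAR 1 into a single 64-bit address.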
Ethin wrote:What do I do about BARs that have holes? I.e.: the mPCIe-ICM485-2 2x Isolated RS485 PCI Express Mini Card has two bars: BAR zero, which is FD000008h, and BAR 2, FEBF4000h.
Lots of devices share the same device ID. You need to match both the vendor ID and the device ID to find the device name. This is actually the QEMU VGA device. The specific meanings of each BAR are assigned depending on the device, so you should ignore the holes. For the QEMU VGA device, they left the hole there so they can have a 64-bit BAR 0 in the future without breaking existing drivers. If they were already using BAR 1 for something else, there would be no room for BAR 0 to be 64-bit.
Ethin wrote:2) Why am I seeing so many devices with all six bars being all zeros?
Some devices don't need them.
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Ethin »

Octocontrabass wrote:
Ethin wrote:I think they're the BARs I'm looking for, even though I get strange results...
Some of the strange results are due to being off-by-one in your bit ranges. In Rust, the last number in a range is not included unless it's prefixed with an equals sign, so when you write get_bits(0..7) you're only getting bits 0 through 6. You need to instead write either get_bits(0..8) or get_bits(0..=7) for all 8 bits.
Thank you for that, that will solve that problem.
Octocontrabass wrote:
Ethin wrote:I finally got the right BAR for the HDA device, at least. (I'm doing no bit masking whatsoever, even though the PCI article says I need to do so.)
That's just a coincidence. All of the bits you need to mask happen to already be zero in this case.
How would I determine this at runtime? Would I just mask all the bits anyway?
Octocontrabass wrote:
Ethin wrote:Does this mean I need to check bit zero, another bit, or a set of bits?
To distinguish between memory and I/O BARs, check bit zero. It's always zero for memory BARs, and always one for I/O BARs.

To distinguish between different types of memory BARs, check bits 1 and 2. The value 0 means it's a 32-bit address. The value 2 means it's a 64-bit address, with the upper 32 bits located in the next BAR. The value 1 is obsolete, but used to mean it's a 16-bit address.
Thanks. I'll implement that.
Octocontrabass wrote:
Ethin wrote:Let's say I'm looking at BAR 0 of the gigabit controller. That BAR is FEBC0000. It also has another BAR, C601. Seems like a 64-bit address to me.
It's not. Bit 0 of the first BAR is 0, so it's a memory address. Bits 1 and 2 are also 0, so it's a 32-bit memory address. Since it's a 32-bit memory address, you should parse the second BAR like normal instead of treating it like upper address bits. (Incidentally, bit 0 of the second BAR is 1, which means it's an I/O address.)

Your math is correct for a 64-bit BAR, if you find any.
Again, thank you.
Octocontrabass wrote:
Ethin wrote:What do I do about BARs that have holes? I.e.: the mPCIe-ICM485-2 2x Isolated RS485 PCI Express Mini Card has two bars: BAR zero, which is FD000008h, and BAR 2, FEBF4000h.
Lots of devices share the same device ID. You need to match both the vendor ID and the device ID to find the device name. This is actually the QEMU VGA device. The specific meanings of each BAR are assigned depending on the device, so you should ignore the holes. For the QEMU VGA device, they left the hole there so they can have a 64-bit BAR 0 in the future without breaking existing drivers. If they were already using BAR 1 for something else, there would be no room for BAR 0 to be 64-bit.
I'm not really sure how to do this. I get this information by calling get_vendor_string(), get_device_string(), and get_class_string(), all of which are defined in src/pcidb.rs. It's a Rust implementation of the entire PCI ID database. It's a bad idea, I know, and it's only there temporarily so I don't need to repeatedly go looking up device IDs and all because my kernel will just tell me. It makes things a bit easier. (Don't look at it, it's massive.)
Octocontrabass wrote:
Ethin wrote:2) Why am I seeing so many devices with all six bars being all zeros?
Some devices don't need them.
Good to know. Thank you for your extremely helpful advice.
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Re: Reading from SATA/ATA/... disks (and maybe solving my PC

Post by Octocontrabass »

Ethin wrote:How would I determine this at runtime? Would I just mask all the bits anyway?
Yes.
Ethin wrote:I get this information by calling get_vendor_string(), get_device_string(), and get_class_string(), all of which are defined in src/pcidb.rs. It's a Rust implementation of the entire PCI ID database.
But get_device_string() tries to identify the device using only the device ID, which means it will be wrong if more than one vendor assigns the same device ID. It needs to work like get_subclass_string(), taking both the vendor ID and the device ID as input.
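
As a rough illustration of a lookup keyed on both IDs, here is a sketch with a tiny hypothetical table. The DEVICES table and the device_name function are invented for this example and do not reflect how src/pcidb.rs is organised. The three entries come from the probe output and the identification earlier in this thread:

```rust
/// (vendor ID, device ID, name). A real table would be far larger.
pub static DEVICES: &[(u16, u16, &str)] = &[
    (0x8086, 0x100E, "82540EM Gigabit Ethernet Controller"),
    (0x8086, 0x7010, "82371SB PIIX3 IDE [Natoma/Triton II]"),
    (0x1234, 0x1111, "QEMU VGA device"),
];

/// Looks a device up by the *pair* of IDs; a device ID alone is not unique.
pub fn device_name(vendor: u16, device: u16) -> Option<&'static str> {
    DEVICES
        .iter()
        .find(|entry| entry.0 == vendor && entry.1 == device)
        .map(|entry| entry.2)
}
```

With the vendor ID included, 1234h:1111h resolves to the QEMU VGA device, while the same device ID under a different vendor simply returns None instead of a misleading name.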