Reading and Writing HDD

rdos · Post by **rdos** » Thu Sep 02, 2021 2:29 pm

Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.

That's a pretty inefficient method since it means you quite likely will need to do major updates to the disc drivers (and partition & FS handling) since you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS related or partition related.

klange · Post by **klange** » Thu Sep 02, 2021 8:16 pm

rdos wrote: That's a pretty inefficient method since it means you quite likely will need to do major updates to the disc drivers (and partition & FS handling) since you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS related or partition related.

I think, from a perspective of designing a complete system, this is good advice, but at the same time I think there's a viable counterargument to be made about not trying to think too much about a complete design when learning and starting out. I also think it's terribly discouraging to beginners to call them lazy for not thinking a dozen steps ahead.

Ethin · Post by **Ethin** » Fri Sep 03, 2021 1:58 pm

rdos wrote:
Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.
That's a pretty inefficient method since it means you quite likely will need to do major updates to the disc drivers (and partition & FS handling) since you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS related or partition related.

By this logic, one should completely design all their paging structures and their virtual memory manager before even beginning to learn the basics of OSDev. Oh, and they should also learn how to update CPU microcode and design a complete USB stack too. Even though they're just starting to learn the basics.
In other words, this is a pretty ridiculous idea unless you're designing this OS professionally. But if you're just starting out with AHCI or NVMe or IDE, implementing an LRU cache (for example) is overkill. A good idea, perhaps, but unnecessary baggage when you're just learning how all the parts fit together and you're just trying to read and write some data.

rdos · Post by **rdos** » Fri Sep 03, 2021 2:58 pm

Ethin wrote:
rdos wrote:
Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.
That's a pretty inefficient method since it means you quite likely will need to do major updates to the disc drivers (and partition & FS handling) since you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS related or partition related.
By this logic, one should completely design all their paging structures and their virtual memory manager before even beginning to learn the basics of OSDev. Oh, and they should also learn how to update CPU microcode and design a complete USB stack too. Even though they're just starting to learn the basics.
In other words, this is a pretty ridiculous idea unless you're designing this OS professionally. But if you're just starting out with AHCI or NVMe or IDE, implementing an LRU cache (for example) is overkill. A good idea, perhaps, but unnecessary baggage when you're just learning how all the parts fit together and you're just trying to read and write some data.

If I'm putting a lot of time into something, then I certainly want it to be of professional quality. Why else would I bother? Just modifying some tutorial, or doing a new Linux distro, is completely uninteresting IMHO. Then I wouldn't be doing anything significant that has some use.

Besides, I didn't claim you needed to write all of that before doing a disc driver, only to consider how it would best operate with a cache. The difference is not big, and it won't take much longer to use a sector interface & aligned physical addresses instead of using an "int 0x13" type of interface. Actually, for AHCI and NVMe it's easier to write a driver that uses aligned physical addresses than to write an interface that takes a (potentially) unaligned linear address.
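A minimal sketch of what a "sector interface with aligned physical addresses" could look like. Everything here is an assumption made for illustration: the names `disk_request` and `disk_request_valid` are hypothetical, and the 4 KiB alignment is a choice for this example, not a requirement quoted from the AHCI or NVMe specifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_ALIGN 4096u  /* assumed buffer alignment: one 4 KiB page */

/* Hypothetical request: whole sectors plus one aligned physical buffer. */
struct disk_request {
    uint64_t lba;        /* first sector on the disc */
    uint32_t count;      /* number of sectors to transfer */
    uint64_t phys_addr;  /* physical address of the data buffer */
    bool     write;      /* true for a write, false for a read */
};

/* Check a request before it is handed to the hardware-specific driver. */
static bool disk_request_valid(const struct disk_request *r,
                               uint64_t disk_sectors)
{
    assert(r != NULL);
    if (r->count == 0)
        return false;
    /* BUF_ALIGN is a power of two, so masking tests the alignment. */
    if ((r->phys_addr & (BUF_ALIGN - 1)) != 0)
        return false;
    /* Written this way so that lba + count cannot overflow. */
    if (r->lba >= disk_sectors || r->count > disk_sectors - r->lba)
        return false;
    return true;
}
```

With checks like these done once at the interface, the driver itself never has to deal with a buffer that starts in the middle of a page.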

When the disc driver is operational, you logically should write the disc cache function. That's because you want to use the disc cache interface for everything rather than accessing the physical disc directly. If you don't do it in that order, you will need to modify every call to the disc when you later add the cache. Adding the cache at the end is therefore a bad idea: an efficient disc interface must have one, and rewriting existing code takes more time.
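To illustrate the ordering argument, here is a rough sketch in which callers only ever see `cache_read` and `cache_write`, so the driver behind them can change freely. It assumes a direct-mapped, write-through cache and uses an in-memory array as a stand-in for the real driver; all names and sizes are hypothetical, and a real disc cache would be considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define CACHE_SLOTS  64u
#define DISK_SECTORS 256u

/* Stand-in for the real driver: an in-memory "disc" and a read counter. */
static uint8_t  fake_disk[DISK_SECTORS][SECTOR_SIZE];
static unsigned driver_reads;

static int driver_read(uint64_t lba, void *buf)
{
    if (lba >= DISK_SECTORS)
        return -1;
    memcpy(buf, fake_disk[lba], SECTOR_SIZE);
    driver_reads++;
    return 0;
}

static int driver_write(uint64_t lba, const void *buf)
{
    if (lba >= DISK_SECTORS)
        return -1;
    memcpy(fake_disk[lba], buf, SECTOR_SIZE);
    return 0;
}

struct cache_slot {
    int      valid;
    uint64_t lba;
    uint8_t  data[SECTOR_SIZE];
};
static struct cache_slot cache[CACHE_SLOTS];

/* Read through the cache; only a miss reaches the driver. */
static int cache_read(uint64_t lba, void *buf)
{
    struct cache_slot *s = &cache[lba % CACHE_SLOTS];
    assert(buf != NULL);
    if (!s->valid || s->lba != lba) {
        s->valid = 0;  /* the slot is about to be overwritten */
        if (driver_read(lba, s->data) != 0)
            return -1;
        s->lba = lba;
        s->valid = 1;
    }
    memcpy(buf, s->data, SECTOR_SIZE);
    return 0;
}

/* Write-through: update the disc first, then the cached copy. */
static int cache_write(uint64_t lba, const void *buf)
{
    struct cache_slot *s = &cache[lba % CACHE_SLOTS];
    assert(buf != NULL);
    if (driver_write(lba, buf) != 0)
        return -1;
    memcpy(s->data, buf, SECTOR_SIZE);
    s->lba = lba;
    s->valid = 1;
    return 0;
}
```

If the filesystem code is written against `cache_read` and `cache_write` from the start, replacing this toy cache with a better one later does not touch any caller.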

As for the VFS, if you start coding your FS directly against your physical disc, then you will need to rewrite even more when you decide you need a VFS. So you should start with the VFS, defining your interface, and then write the functions for the FS you want to start with. Also note that you don't need to implement or define the complete VFS before you code anything on FSes. As soon as you have defined a VFS function, you can code it in the FS you are using.
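One way to read "define a VFS function, then code it in the FS" is a table of function pointers that is filled in one entry at a time. The sketch below is only an illustration of that idea; `vfs_ops`, `vfs_mount`, `toyfs_read` and the error codes are hypothetical names, not taken from any poster's kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VFS_ENOSYS (-1)  /* hypothetical: operation not implemented yet */
#define VFS_ENOENT (-2)  /* hypothetical: no such file */

/* The VFS interface, grown one function at a time. */
struct vfs_ops {
    long (*read)(void *fs, const char *path, void *buf, size_t len);
    long (*write)(void *fs, const char *path, const void *buf, size_t len);
};

struct vfs_mount {
    const struct vfs_ops *ops;
    void                 *fs;   /* filesystem-private state */
};

static long vfs_read(struct vfs_mount *m, const char *path,
                     void *buf, size_t len)
{
    assert(m != NULL && m->ops != NULL);
    if (m->ops->read == NULL)
        return VFS_ENOSYS;
    return m->ops->read(m->fs, path, buf, len);
}

static long vfs_write(struct vfs_mount *m, const char *path,
                      const void *buf, size_t len)
{
    assert(m != NULL && m->ops != NULL);
    if (m->ops->write == NULL)
        return VFS_ENOSYS;
    return m->ops->write(m->fs, path, buf, len);
}

/* Toy filesystem with a single file, "/hello"; its state is a C string. */
static long toyfs_read(void *fs, const char *path, void *buf, size_t len)
{
    const char *content = fs;
    size_t n;

    if (strcmp(path, "/hello") != 0)
        return VFS_ENOENT;
    n = strlen(content);
    if (n > len)
        n = len;
    memcpy(buf, content, n);
    return (long)n;
}

/* Only read exists so far; write stays NULL until it is written. */
static const struct vfs_ops toyfs_ops = { .read = toyfs_read };
```

Callers go through `vfs_read` and `vfs_write` from day one, and an operation that has not been coded yet simply reports `VFS_ENOSYS`.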

thewrongchristian · Post by **thewrongchristian** » Fri Sep 03, 2021 5:06 pm

rdos wrote:
Besides, I didn't claim you needed to write all of that before doing a disc driver, only to consider how it would best operate with a cache. The difference is not big, and it won't take much longer to use a sector interface & aligned physical addresses instead of using an "int 0x13" type of interface. Actually, for AHCI and NVMe it's easier to write a driver that uses aligned physical addresses than to write an interface that takes a (potentially) unaligned linear address.

I've got to agree with you. You should have in mind how everything is going to fit together, or else your interface design decisions might be questionable. And it seems easier to design/implement top down, as your applications will be using the interface at the top of your kernel.

Personally, I designed my disk driver interface as an asynchronous read/write request interface that would be used by filesystems, and the first implementation was my RAM-based initrd (loaded as a module for me by GRUB). Adding an asynchronous interface to what is essentially memcpy seemed overkill, but I knew I needed an asynchronous interface (HDDs are slow!), and using the same interface with my ATA driver allowed the already debugged FS code to work with the ATA HDD with no changes. But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)
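A rough sketch of such a request interface with a RAM-backed device behind it. This is not thewrongchristian's actual code; `blk_request`, `blk_device`, `ramdisk_submit` and `store_status` are hypothetical names, and a 512-byte sector size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

struct blk_request;
typedef void (*blk_done_fn)(struct blk_request *req, int status);

/* One asynchronous transfer; the result arrives through done(). */
struct blk_request {
    uint64_t    lba;    /* first sector */
    uint32_t    count;  /* number of sectors */
    void       *buf;    /* data buffer */
    int         write;  /* non-zero for a write */
    blk_done_fn done;   /* completion callback */
    void       *ctx;    /* caller-private pointer for the callback */
};

/* Every backend (ramdisk, ATA, USB storage) exposes the same submit(). */
struct blk_device {
    void   (*submit)(struct blk_device *dev, struct blk_request *req);
    void    *priv;
    uint64_t sectors;
};

struct ramdisk {
    uint8_t *base;  /* start of the in-memory image */
};

/* Ramdisk backend: effectively memcpy, completing before submit returns.
 * A real disc driver would queue the request and call done() later,
 * for example from its interrupt handling path. */
static void ramdisk_submit(struct blk_device *dev, struct blk_request *req)
{
    struct ramdisk *rd = dev->priv;
    int status = -1;

    assert(req->done != NULL);
    if (req->lba < dev->sectors && req->count <= dev->sectors - req->lba) {
        uint8_t *p = rd->base + req->lba * SECTOR_SIZE;
        size_t   n = (size_t)req->count * SECTOR_SIZE;

        if (req->write)
            memcpy(p, req->buf, n);
        else
            memcpy(req->buf, p, n);
        status = 0;
    }
    req->done(req, status);
}

/* Example callback: store the status where ctx points. */
static void store_status(struct blk_request *req, int status)
{
    *(int *)req->ctx = status;
}
```

Because the filesystem only calls `dev->submit` and waits for the callback, pointing it at a different `blk_device` needs no filesystem changes, which is the effect described above.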

All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.

rdos · Post by **rdos** » Mon Sep 06, 2021 8:48 am

thewrongchristian wrote: Personally, I designed my disk driver interface as an asynchronous read/write request interface that would be used by filesystems, and the first implementation was my RAM-based initrd (loaded as a module for me by GRUB). Adding an asynchronous interface to what is essentially memcpy seemed overkill, but I knew I needed an asynchronous interface (HDDs are slow!), and using the same interface with my ATA driver allowed the already debugged FS code to work with the ATA HDD with no changes.

Exactly. If you design reasonable interfaces instead of hardcoding disc operations in the FS, then you can easily run a debugged FS on a new disc drive with no changes.

thewrongchristian wrote: But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)

Partitioning is quite simple. Instead of assuming that your FS starts at sector 0, you save the start sector (and sector count) in a partition structure. When you do a read from the partition, you check that it is within the limits (sector count), add the start sector, and send the request to the disc driver.
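The translation described above fits in a few lines. A minimal sketch, with `partition` and `partition_to_disk` as hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the partition layer remembers about one partition. */
struct partition {
    uint64_t start;  /* first sector of the partition on the disc */
    uint64_t count;  /* number of sectors in the partition */
};

/* Translate a partition-relative request into an absolute sector.
 * Returns 0 on success, -1 if the request leaves the partition. */
static int partition_to_disk(const struct partition *p, uint64_t rel_lba,
                             uint32_t nsect, uint64_t *abs_lba)
{
    assert(p != NULL && abs_lba != NULL);
    /* Written this way so that rel_lba + nsect cannot overflow. */
    if (nsect == 0 || rel_lba >= p->count || nsect > p->count - rel_lba)
        return -1;
    *abs_lba = p->start + rel_lba;
    return 0;
}
```

The resulting `abs_lba` and the unchanged sector count are then passed on to the disc driver (or the disc cache in front of it).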

thewrongchristian wrote: All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.

I have a meta-data cache too (in the VFS server). I think you still need the disc cache too though.

thewrongchristian · Post by **thewrongchristian** » Mon Sep 06, 2021 12:17 pm

rdos wrote:
thewrongchristian wrote: But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)
Partitioning is quite simple. Instead of assuming that your FS starts at sector 0, you save the start sector (and sector count) in a partition structure. When you do a read from the partition, you check that it is within the limits (sector count), add the start sector, and send the request to the disc driver.

thewrongchristian wrote: All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.
I have a meta-data cache too (in the VFS server). I think you still need the disc cache too though.

I'm planning to implement virtual block devices, which can be arbitrarily stacked and provide a variety of mappings and translations, probably most similar to GEOM in FreeBSD. I'll provide at least:

- Partition table support. Create a simple translating virtual block device per partition (the translation being the offset of the partition in the underlying device). No caching, just translation.
- Block size translating virtual block devices. A generic buffering device which converts block sizes. So, for example, a FS using a block size of 1024B can work on devices that provide a block size of 4096B, without having to worry about the block size of the underlying device. This type of device would provide a small amount of caching for reads/writes that are a subset of the underlying block, but little more.
- A labelled virtual block device, with handling of transient underlying block devices. An example of this might be for USB media, where we might have to handle unsafely removed USB devices, which we can do by caching and queuing writes, and blocking reads, until the underlying device is plugged back in.

Other types might include RAID, transparent encryption and transparent write logging.
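As an illustration of the block size translating device from the list above, here is a read-only sketch that serves 1024B blocks from a 4096B device and keeps the last underlying block in a buffer. It is not GEOM code and not the poster's design; `resize_dev`, `resize_read` and the in-memory `lower_dev` array are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOWER_BLOCK  4096u  /* block size of the underlying device */
#define UPPER_BLOCK  1024u  /* block size the filesystem wants */
#define RATIO        (LOWER_BLOCK / UPPER_BLOCK)
#define LOWER_BLOCKS 8u

#if LOWER_BLOCK % UPPER_BLOCK != 0
#error "upper block size must divide the lower block size"
#endif

/* Stand-in for the underlying device, with a read counter. */
static uint8_t  lower_dev[LOWER_BLOCKS][LOWER_BLOCK];
static unsigned lower_reads;

static int lower_read(uint64_t blk, void *buf)
{
    if (blk >= LOWER_BLOCKS)
        return -1;
    memcpy(buf, lower_dev[blk], LOWER_BLOCK);
    lower_reads++;
    return 0;
}

/* The translating device buffers exactly one underlying block. */
struct resize_dev {
    int      have;        /* buffer holds a valid lower block */
    uint64_t cached_blk;  /* which lower block it holds */
    uint8_t  buffer[LOWER_BLOCK];
};

/* Read one UPPER_BLOCK-sized block, numbered in upper-block units. */
static int resize_read(struct resize_dev *d, uint64_t upper_blk, void *buf)
{
    uint64_t lower_blk = upper_blk / RATIO;
    size_t   offset    = (size_t)(upper_blk % RATIO) * UPPER_BLOCK;

    assert(d != NULL && buf != NULL);
    if (!d->have || d->cached_blk != lower_blk) {
        d->have = 0;  /* the buffer is about to be overwritten */
        if (lower_read(lower_blk, d->buffer) != 0)
            return -1;
        d->cached_blk = lower_blk;
        d->have = 1;
    }
    memcpy(buf, d->buffer + offset, UPPER_BLOCK);
    return 0;
}
```

Four consecutive 1024B reads that fall in the same 4096B block cost one underlying read, which is the "small amount of caching" the list item mentions. Writes would need a read-modify-write of the underlying block and are left out here.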

rdos · Post by **rdos** » Mon Sep 06, 2021 2:09 pm

thewrongchristian wrote: A labelled virtual block device, with handling of transient underlying block devices. An example of this might be for USB media, where we might have to handle unsafely removed USB devices, which we can do by caching and queuing writes, and blocking reads, until the underlying device is plugged back in.

I don't think you can rely on the USB device being plugged back in. In my design, an application can continue to use cached data even if the device has been unplugged, but will get failures if it goes beyond that. It will not wait for the device to get plugged in again, as that would block forever if it never is.
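The behaviour described here (cached data stays readable after an unplug, anything else fails at once) might be sketched as follows. This is an assumption-laden illustration, not rdos's implementation: `usb_disk`, `usb_disk_read` and `ERR_GONE` are hypothetical, and an in-memory array stands in for the medium.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE   512u
#define MEDIA_SECTORS 16u
#define SLOTS         4u
#define ERR_RANGE     (-1)  /* hypothetical: sector out of range */
#define ERR_GONE      (-2)  /* hypothetical: device has been unplugged */

struct usb_disk {
    int     present;                            /* cleared on unplug */
    uint8_t media[MEDIA_SECTORS][SECTOR_SIZE];  /* stand-in for the medium */
    struct {
        int      valid;
        uint64_t lba;
        uint8_t  data[SECTOR_SIZE];
    } slot[SLOTS];                              /* tiny direct-mapped cache */
};

static int usb_disk_read(struct usb_disk *d, uint64_t lba, void *buf)
{
    size_t i = (size_t)(lba % SLOTS);

    assert(d != NULL && buf != NULL);
    /* A cache hit works whether or not the device is still there. */
    if (d->slot[i].valid && d->slot[i].lba == lba) {
        memcpy(buf, d->slot[i].data, SECTOR_SIZE);
        return 0;
    }
    /* A miss on an unplugged device fails immediately; it never waits
     * for the device to come back. */
    if (!d->present)
        return ERR_GONE;
    if (lba >= MEDIA_SECTORS)
        return ERR_RANGE;
    memcpy(d->slot[i].data, d->media[lba], SECTOR_SIZE);
    d->slot[i].lba = lba;
    d->slot[i].valid = 1;
    memcpy(buf, d->slot[i].data, SECTOR_SIZE);
    return 0;
}
```

The caller sees `ERR_GONE` and can decide what to do, instead of being parked on a device that may never return.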

OSDev.org
