Reading and Writing HDD

rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reading and Writing HDD

Post by rdos »

Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.
That's a pretty inefficient method, since it means you will quite likely need to make major updates to the disc drivers (and partition & FS handling) because you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS-related or partition-related.
klange
Member
Posts: 679
Joined: Wed Mar 30, 2011 12:31 am
Libera.chat IRC: klange
Discord: klange

Re: Reading and Writing HDD

Post by klange »

rdos wrote: That's a pretty inefficient method, since it means you will quite likely need to make major updates to the disc drivers (and partition & FS handling) because you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS-related or partition-related.
I think, from a perspective of designing a complete system, this is good advice, but at the same time I think there's a viable counterargument to be made about not trying to think too much about a complete design when learning and starting out. I also think it's terribly discouraging to beginners to call them lazy for not thinking a dozen steps ahead.
Ethin
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Reading and Writing HDD

Post by Ethin »

rdos wrote:
Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.
That's a pretty inefficient method, since it means you will quite likely need to make major updates to the disc drivers (and partition & FS handling) because you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS-related or partition-related.
By this logic, one should completely design all their paging structures and their virtual memory manager before even beginning to learn the basics of OSDev. Oh, and they should also learn how to update CPU microcode and design a complete USB stack too. Even though they're just starting to learn the basics.
In other words, this is a pretty ridiculous idea unless you're designing this OS professionally. But if you're just starting out with AHCI or NVMe or IDE, implementing an LRU cache (for example) is overkill. A good idea, perhaps, but unnecessary baggage when you're just learning how all the parts fit together and you're just trying to read and write some data.
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reading and Writing HDD

Post by rdos »

Ethin wrote:
rdos wrote:
Ethin wrote: This is exactly what I was trying to say. Don't implement a cache when you're just trying to read/write to disk. Once you've got an FS and partition layer in place, then worry about caching.
That's a pretty inefficient method, since it means you will quite likely need to make major updates to the disc drivers (and partition & FS handling) because you were lazy to begin with and didn't plan the design. Both the disc cache and the VFS interface come before doing anything FS-related or partition-related.
By this logic, one should completely design all their paging structures and their virtual memory manager before even beginning to learn the basics of OSDev. Oh, and they should also learn how to update CPU microcode and design a complete USB stack too. Even though they're just starting to learn the basics.
In other words, this is a pretty ridiculous idea unless you're designing this OS professionally. But if you're just starting out with AHCI or NVMe or IDE, implementing an LRU cache (for example) is overkill. A good idea, perhaps, but unnecessary baggage when you're just learning how all the parts fit together and you're just trying to read and write some data.
If I'm putting a lot of time into something, then I certainly want it to be of professional quality. Why else would I bother? Just modifying some tutorial, or doing a new Linux distro, is completely uninteresting IMHO. Then I wouldn't be doing anything significant that has some use.

Besides, I didn't claim you needed to write all of that before doing a disc driver, only to consider how it would best operate with a cache. The difference is not big, and it won't take much longer to use a sector interface & aligned physical addresses instead of using an "int 0x13" type of interface. Actually, for AHCI and NVMe it's easier to write a driver that uses aligned physical addresses than to write an interface that takes a (potentially) unaligned linear address.

When the disc driver is operational, the logical next step is to write the disc cache. That's because you want everything to go through the disc cache interface rather than accessing the physical disc directly. If you don't do it in that order, you will need to modify every call to the disc when you later add the cache. Adding the cache at the end is therefore a bad idea: an efficient disc interface must have one, and rewriting the callers takes more time.
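
To make the ordering concrete, here is a minimal sketch, not code from any poster's kernel. It assumes a hypothetical sector-level driver entry point (`disk_read_sector`), backed here by a RAM array so the example runs anywhere, and a tiny direct-mapped cache (`cache_read_sector`) that the FS and partition code would call instead of the driver. All names and sizes are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512
#define DISK_SECTORS 64
#define CACHE_SLOTS  8

/* Stand-in for a real driver: a RAM-backed "disc" (assumption). */
static uint8_t fake_disk[DISK_SECTORS][SECTOR_SIZE];
static unsigned driver_reads;   /* counts how often the driver is hit */

/* Hypothetical driver entry point: read one sector by LBA. */
static int disk_read_sector(uint32_t lba, uint8_t *buf)
{
    if (lba >= DISK_SECTORS)
        return -1;
    memcpy(buf, fake_disk[lba], SECTOR_SIZE);
    driver_reads++;
    return 0;
}

struct cache_slot {
    int      valid;
    uint32_t lba;
    uint8_t  data[SECTOR_SIZE];
};

static struct cache_slot cache[CACHE_SLOTS];

/* The only read function the FS and partition layers would call. */
int cache_read_sector(uint32_t lba, uint8_t *buf)
{
    assert(buf != NULL);
    struct cache_slot *slot = &cache[lba % CACHE_SLOTS];

    if (!slot->valid || slot->lba != lba) {
        /* Miss: refill the slot from the driver. */
        slot->valid = 0;
        if (disk_read_sector(lba, slot->data) != 0)
            return -1;
        slot->lba = lba;
        slot->valid = 1;
    }
    memcpy(buf, slot->data, SECTOR_SIZE);
    return 0;
}
```

Because callers only ever see `cache_read_sector`, the replacement policy (direct-mapped here, LRU or anything else later) can change without touching FS code.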

As for the VFS, if you start coding your FS directly against your physical disc, then you will need to rewrite even more when you decide you need a VFS. So you should start with the VFS, defining your interface, and then write the functions for the FS you want to start with. Also note that you don't need to implement or define the complete VFS before you code anything on FSes. As soon as you have defined a VFS function, you can code it in the FS you are using.
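
One common way to grow a VFS one function at a time is a table of function pointers. The sketch below is only an illustration under that assumption: `struct vfs_ops`, `vfs_read` and the toy in-memory FS are hypothetical names, and the table has a single entry because only one VFS function has been "defined" so far.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vfs_node;

/* The VFS interface; new entries are added as they are defined. */
struct vfs_ops {
    long (*read)(struct vfs_node *node, size_t off, void *buf, size_t len);
};

struct vfs_node {
    const struct vfs_ops *ops;
    void   *fs_data;   /* private to the concrete FS */
    size_t  size;      /* file size in bytes */
};

/* Callers only ever go through this entry point. */
long vfs_read(struct vfs_node *node, size_t off, void *buf, size_t len)
{
    assert(node != NULL && buf != NULL);
    if (node->ops == NULL || node->ops->read == NULL)
        return -1;     /* this FS has not implemented read yet */
    return node->ops->read(node, off, buf, len);
}

/* A toy FS whose "file" is a byte array in memory. */
static long toyfs_read(struct vfs_node *node, size_t off, void *buf, size_t len)
{
    if (off >= node->size)
        return 0;                      /* read at or past end of file */
    if (len > node->size - off)
        len = node->size - off;        /* clamp to what is left */
    memcpy(buf, (const char *)node->fs_data + off, len);
    return (long)len;
}

static const struct vfs_ops toyfs_ops = { .read = toyfs_read };
```

A second FS would supply its own `vfs_ops` table; nothing that calls `vfs_read` would need to change.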
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Reading and Writing HDD

Post by thewrongchristian »

rdos wrote:
Besides, I didn't claim you needed to write all of that before doing a disc driver, only to consider how it would best operate with a cache. The difference is not big, and it won't take much longer to use a sector interface & aligned physical addresses instead of using an "int 0x13" type of interface. Actually, for AHCI and NVMe it's easier to write a driver that uses aligned physical addresses than to write an interface that takes a (potentially) unaligned linear address.
I've got to agree with you. You should have in mind how everything is going to fit together, or else your interface design decisions might be questionable. And it seems easier to design/implement top down, as your applications will be using the interface at the top of your kernel.

Personally, I designed my disk driver interface as an asynchronous read/write request interface that would be used by filesystems, and the first implementation was my RAM-based initrd (loaded as a module for me by GRUB). Adding an asynchronous interface to what is essentially memcpy seemed overkill, but I knew I needed an asynchronous interface (HDDs are slow!), and using the same interface with my ATA driver allowed the already debugged FS code to work with the ATA HDD with no changes. But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)

All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.
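
For readers wondering what such a request interface might look like, here is a rough sketch. It is not the actual interface described above; `struct blk_request`, `struct blk_device` and the callback signature are assumptions. The RAM-backed device completes at once, but it reports completion through the same callback a slower ATA driver would invoke later from its interrupt handler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

struct blk_request;
typedef void (*blk_done_fn)(struct blk_request *req, int status);

/* One read or write request; completion is reported via callback. */
struct blk_request {
    uint32_t    lba;
    uint32_t    count;   /* number of sectors */
    int         write;   /* 0 = read, 1 = write */
    void       *buf;
    blk_done_fn done;
    void       *ctx;     /* caller's private data */
};

struct blk_device {
    uint32_t sectors;
    void   (*submit)(struct blk_device *dev, struct blk_request *req);
    void    *priv;       /* driver state; here, the RAM image */
};

/* RAM-backed device: essentially memcpy behind the async interface. */
static void ramdisk_submit(struct blk_device *dev, struct blk_request *req)
{
    assert(req->done != NULL);
    uint8_t *base = dev->priv;

    /* Reject ranges that fall outside the device. */
    if (req->count > dev->sectors || req->lba > dev->sectors - req->count) {
        req->done(req, -1);
        return;
    }
    size_t off = (size_t)req->lba * SECTOR_SIZE;
    size_t len = (size_t)req->count * SECTOR_SIZE;
    if (req->write)
        memcpy(base + off, req->buf, len);
    else
        memcpy(req->buf, base + off, len);
    req->done(req, 0);
}

/* Example completion handler: store the status where ctx points. */
static void record_status(struct blk_request *req, int status)
{
    *(int *)req->ctx = status;
}
```

FS code written against `submit` and the completion callback does not care whether the device answers immediately or after an interrupt, which is what lets already debugged FS code move from the initrd to a real disk unchanged.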
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reading and Writing HDD

Post by rdos »

thewrongchristian wrote: Personally, I designed my disk driver interface as an asynchronous read/write request interface that would be used by filesystems, and the first implementation was my RAM-based initrd (loaded as a module for me by GRUB). Adding an asynchronous interface to what is essentially memcpy seemed overkill, but I knew I needed an asynchronous interface (HDDs are slow!), and using the same interface with my ATA driver allowed the already debugged FS code to work with the ATA HDD with no changes.
Exactly. If you design reasonable interfaces instead of hardcoding disc operations in the FS, then you can easily run a debugged FS on a new disc drive with no changes.
thewrongchristian wrote: But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)
Partitioning is quite simple. Instead of assuming that your FS starts at sector 0, you save the start sector (and sector count) in a partition structure. When you do a read from the partition, you check that it is within the limits (sector count), add the start sector, and send the request to the disc driver.
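
The check-and-offset step just described can be sketched in a few lines. The structure layout and the function name `partition_translate` are assumptions for illustration; a real implementation would then pass the resulting LBA to the disc driver or cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct partition {
    uint64_t start_sector;   /* first sector of the partition on the disc */
    uint64_t sector_count;   /* size of the partition in sectors */
};

/* Translate a partition-relative sector range to an absolute disc LBA.
 * Returns 0 and stores the LBA, or -1 if the range is out of bounds. */
int partition_translate(const struct partition *p, uint64_t rel,
                        uint64_t count, uint64_t *abs_lba)
{
    assert(p != NULL && abs_lba != NULL);
    /* Written this way so that rel + count cannot overflow. */
    if (count == 0 || count > p->sector_count ||
        rel > p->sector_count - count)
        return -1;
    *abs_lba = p->start_sector + rel;
    return 0;
}
```

For a partition starting at sector 2048 with 100 sectors, relative sector 99 maps to absolute sector 2147, while a two-sector read starting at 99 is rejected.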
thewrongchristian wrote: All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.
I have a meta-data cache too (in the VFS server). I think you still need the disc cache too though.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Reading and Writing HDD

Post by thewrongchristian »

rdos wrote:
thewrongchristian wrote: But I still haven't done any partitioning, which I plan to put in place once I have my USB storage driver finished (read sectors last night!)
Partitioning is quite simple. Instead of assuming that your FS starts at sector 0, you save the start sector (and sector count) in a partition structure. When you do a read from the partition, you check that it is within the limits (sector count), add the start sector, and send the request to the disc driver.
thewrongchristian wrote: All that said, my cache operates at the VFS/vnode level, so filesystem meta-data is read from the disk driver using a temporary buffer rather than semi-persistent cache memory.
I have a meta-data cache too (in the VFS server). I think you still need the disc cache too though.
I'm planning to implement virtual block devices, which can be arbitrarily stacked and provide a variety of mappings and translations, probably most similar to GEOM in FreeBSD. I'll provide at least:
  • Partition table support. Create a simple translating virtual block device per partition (the translation being the offset of the partition in the underlying device). No caching, just translation.
  • Block size translating virtual block devices. A generic buffering device which converts block sizes. So, for example, a FS using a block size of 1024B can work on devices that provide a block size of 4096B, without having to worry about the block size of the underlying device. This type of device would provide a small amount of caching for reads/writes that are a subset of the underlying block, but little more.
  • A labelled virtual block device, with handling of transient underlying block devices. An example of this might be for USB media, where we might have to handle unsafely removed USB devices, which we can do by caching and queuing writes, and blocking reads, until the underlying device is plugged back in.
Other types might include RAID, transparent encryption and transparent write logging.
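
As an illustration of the block size translating device in the list above, here is a read-only sketch for the 1024B-on-4096B case. It is not GEOM code and not the planned implementation; `struct lower_dev`, `struct xlate_dev` and `xlate_read` are hypothetical names, and the single buffered block stands in for the "small amount of caching" mentioned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEV_BLOCK 4096                   /* underlying device block size */
#define FS_BLOCK  1024                   /* block size the FS wants */
#define RATIO     (DEV_BLOCK / FS_BLOCK)

/* Lower device: anything that can read one DEV_BLOCK-sized block. */
struct lower_dev {
    int    (*read_block)(struct lower_dev *dev, uint32_t blk, uint8_t *buf);
    void    *priv;
    uint32_t nblocks;
    uint32_t reads;                      /* statistics only */
};

struct xlate_dev {
    struct lower_dev *lower;
    int      have;                       /* is 'buffer' valid? */
    uint32_t buffered_blk;
    uint8_t  buffer[DEV_BLOCK];
};

/* Read one FS_BLOCK-sized block through the translating device. */
int xlate_read(struct xlate_dev *x, uint32_t fs_blk, uint8_t *out)
{
    assert(x != NULL && out != NULL);
    uint32_t dev_blk = fs_blk / RATIO;
    uint32_t offset  = (fs_blk % RATIO) * FS_BLOCK;

    if (!x->have || x->buffered_blk != dev_blk) {
        x->have = 0;
        if (x->lower->read_block(x->lower, dev_blk, x->buffer) != 0)
            return -1;
        x->buffered_blk = dev_blk;
        x->have = 1;
    }
    memcpy(out, x->buffer + offset, FS_BLOCK);
    return 0;
}

/* RAM-backed lower device for trying the sketch out. */
static int ram_read_block(struct lower_dev *dev, uint32_t blk, uint8_t *buf)
{
    if (blk >= dev->nblocks)
        return -1;
    memcpy(buf, (const uint8_t *)dev->priv + (size_t)blk * DEV_BLOCK, DEV_BLOCK);
    dev->reads++;
    return 0;
}
```

Since the lower device is itself just a function pointer plus private data, the same shape lets a partition device, a RAID device or an encryption device sit underneath without the FS noticing.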
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reading and Writing HDD

Post by rdos »

thewrongchristian wrote: A labelled virtual block device, with handling of transient underlying block devices. An example of this might be for USB media, where we might have to handle unsafely removed USB devices, which we can do by caching and queuing writes, and blocking reads, until the underlying device is plugged back in.
I don't think you can rely on the USB device being plugged back in. In my design, an application can continue to use cached data even if the device has been unplugged, but will get failures if it goes beyond that. It will not wait for the device to be plugged in again, since that would block it permanently if the device never returns.
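
The fail-fast behaviour described here could look roughly like the sketch below. It is a guess at the shape, not the real design: the `present` flag, the `ERR_NO_DEVICE` code and all other names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE   512
#define CACHE_SLOTS   4
#define ERR_NO_DEVICE (-2)   /* hypothetical error code */

struct cached_sector {
    int      valid;
    uint32_t lba;
    uint8_t  data[SECTOR_SIZE];
};

struct removable_dev {
    int   present;                            /* cleared on removal */
    int (*read)(uint32_t lba, uint8_t *buf);  /* real driver read */
    struct cached_sector cache[CACHE_SLOTS];
};

int removable_read(struct removable_dev *d, uint32_t lba, uint8_t *buf)
{
    assert(d != NULL && buf != NULL);
    struct cached_sector *s = &d->cache[lba % CACHE_SLOTS];

    if (s->valid && s->lba == lba) {
        /* Cached data stays usable even after the device is gone. */
        memcpy(buf, s->data, SECTOR_SIZE);
        return 0;
    }
    if (!d->present)
        return ERR_NO_DEVICE;   /* fail at once instead of waiting */

    s->valid = 0;
    if (d->read(lba, s->data) != 0)
        return -1;
    s->lba = lba;
    s->valid = 1;
    memcpy(buf, s->data, SECTOR_SIZE);
    return 0;
}

/* Stand-in driver: fills the sector with the low byte of its LBA. */
static int fake_usb_read(uint32_t lba, uint8_t *buf)
{
    memset(buf, (int)(lba & 0xFF), SECTOR_SIZE);
    return 0;
}
```

The alternative design, blocking reads until the device returns, would replace the `ERR_NO_DEVICE` return with a wait, which is exactly the step that can hang forever if the device never comes back.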