Storage device stack and volume caches
Posted: Sat Nov 21, 2009 6:09 pm
by Gigasoft
In my OS, I am going to expose block-based interfaces for storage devices, where whole sectors are always read or written, as this keeps the storage drivers simple. This interface is translated into a byte-based interface that acts like a randomly accessible file. In the case of a hard disk, that byte-based view is then split into partitions which also act as files, and these are what the file system drivers use to access the storage device. But somewhere along this path a cache has to be implemented, and I can't decide where.
It could be placed below the storage device's interface, inside or below the random-access translation layer, below the interfaces for the individual partitions, or implemented as part of the file systems. What would be the optimal place for it?
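As a point of reference, the layering described above might look something like this in C. This is only an illustrative sketch; all names (`partition_t`, `partition_to_disk`) are invented here, and it only shows the partition layer's offset translation, not the full stack:

```c
#include <stdint.h>

/* A partition is a byte-addressable window onto the disk's byte-based
 * view, which in turn sits on the sector-based block device. */
typedef struct {
    uint64_t base;    /* byte offset of the partition's start on the disk */
    uint64_t length;  /* partition size in bytes */
} partition_t;

/* Translate a partition-relative byte offset into a whole-disk byte
 * offset; returns -1 if the access would run past the partition's end. */
int64_t partition_to_disk(const partition_t *p, uint64_t offset, uint64_t len)
{
    if (offset + len > p->length)
        return -1;                 /* out of bounds */
    return (int64_t)(p->base + offset);
}
```

Each layer in the stack would do a translation of roughly this shape: partitions shift byte offsets, and the byte-to-block layer splits offsets into sector numbers and intra-sector offsets.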
Re: Storage device stack and volume caches
Posted: Sat Nov 21, 2009 6:50 pm
by NickJohnson
I think caches for block devices are usually on the block level, below the filesystem and pseudo-random-access layer. That way, the block device can do DMA directly to the entries in the cache, but the granularity is reasonably small (compared to caching whole partitions).
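A minimal sketch of what such a block-level cache could look like, assuming a direct-mapped design keyed by LBA (all names and sizes here are illustrative; a real cache would also track dirty state and handle writeback):

```c
#include <stdint.h>
#include <stddef.h>

#define SECTOR_SIZE 512
#define CACHE_SLOTS 64

typedef struct {
    uint64_t lba;
    int      valid;
    uint8_t  data[SECTOR_SIZE];  /* the device can DMA straight into this buffer */
} cache_entry_t;

static cache_entry_t cache[CACHE_SLOTS];

/* Returns the cached buffer for `lba`, or NULL on a miss. */
uint8_t *cache_lookup(uint64_t lba)
{
    cache_entry_t *e = &cache[lba % CACHE_SLOTS];
    return (e->valid && e->lba == lba) ? e->data : NULL;
}

/* Claims a slot for `lba` (evicting whatever was there) and returns its
 * buffer, so the driver can DMA the sector directly into the cache. */
uint8_t *cache_fill_slot(uint64_t lba)
{
    cache_entry_t *e = &cache[lba % CACHE_SLOTS];
    e->lba = lba;
    e->valid = 1;
    return e->data;
}
```

Because the cache entries are the DMA targets themselves, a read that hits the cache never touches the device, and a miss costs no extra copy.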
Re: Storage device stack and volume caches
Posted: Tue Dec 01, 2009 5:11 am
by mybura
What kind of access patterns are you expecting?
Large vs. Small Blocks
Random vs. Sequential
Keep in mind that the drive itself also has a cache, which is most effective for sequential access.
Re: Storage device stack and volume caches
Posted: Thu Dec 03, 2009 3:07 am
by pcmattman
Cache design is an interesting problem, with various solutions. I'm surprised this topic hasn't been picked up and discussed further.
Our cache works at the disk level: all disks (i.e. block devices in your design) work with a page-sized cache. Blocks one page long are read and written, which comes in handy for sequential reads. The design of this cache is relatively flexible, as we are able to modify the size of cache blocks should we need to based on profiling results.
The best piece of advice I can give that's explicitly targeted at your situation is to build with flexibility. Perhaps implement the cache at both the FS layer and the block device layer, and support switching between them at compile time with a define or something. Support fine-tuning of the cache block size. Then profile, profile, profile. Profiling across various types of block devices, fast and slow, will help you make the right choices in your cache implementation.
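The compile-time knobs suggested above could be sketched like this (the macro names are invented for illustration; the values would be settled by profiling):

```c
#include <stdint.h>

#ifndef CACHE_BLOCK_SHIFT
#define CACHE_BLOCK_SHIFT 12            /* default: 4 KiB (page-sized) cache blocks */
#endif
#define CACHE_BLOCK_SIZE  (1u << CACHE_BLOCK_SHIFT)

#ifndef CACHE_AT_FS_LAYER
#define CACHE_AT_FS_LAYER 0             /* 0 = cache at the block layer, 1 = at the FS layer */
#endif

/* Round a byte offset down to the start of its cache block. */
static inline uint64_t cache_block_base(uint64_t offset)
{
    return offset & ~(uint64_t)(CACHE_BLOCK_SIZE - 1);
}
```

Building the kernel with, say, `-DCACHE_BLOCK_SHIFT=13` would then switch the whole cache to 8 KiB blocks without touching the code, which makes the profiling loop cheap.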
Note: I know this is a very late reply to this question, but I think it's a topic worth discussing further.
Re: Storage device stack and volume caches
Posted: Thu Dec 03, 2009 5:49 am
by Brendan
Hi,
Well, let me be the first to have the opposite opinion: put all caching (e.g. file data and directory entries) at the *highest* level (in the VFS), and avoid the need for the VFS to ask the file system to ask the storage device driver when the data is already cached.
If you cache data at the highest level anyway, then most of the time caching blocks at the lowest level would be a waste of time (there's no point caching the same data twice).
For example, imagine if FAT32 file system code opens the file "/dev/hda2" as read/write. Typical file sharing rules imply that if something is opened as read/write then nothing else can open it as read-only or write-only; and if the FAT32 file system code's data is cached at the highest level then anything that is cached at the lowest level will never be needed.
This doesn't always apply though. For example, you could have "/dev/cdrom" that's opened as read-only by ISO9660 file system code, and the user could open this file as read-only a second time (e.g. to copy the disk to a "*.iso" file). In this case caching at the lowest level might be useful, but for whole-disk copies you'd probably end up with no real benefit anyway. For example, if the cache uses the "least recently used" algorithm and you do "copy /dev/cdrom foo.iso" several times, then new data will push old data out of the cache before it's used, and you'd get no benefit (just wasted RAM and more overhead). A different algorithm might give you some benefit ("least recently used" isn't perfect and has problems in cases like this). Mostly, getting it right (so that there's actually a benefit from caching at the lowest level) would be more hassle than it's worth.
The other situation where caching at the lowest level can make some sense is if you don't treat the lowest level as "blocks". For example, what if you want to write 6 bytes at "Logical Byte Address" 0x123456789ABCDEF? The storage device driver can work out that this is actually 6 bytes at offset 0x1EF in the sector at "Logical Block Address" 0x91A2B3C4D5E6; and could read the entire sector, then modify those 6 bytes, then write the entire sector back to disk. This adds overhead, and (IMHO) would justify a small block cache in the storage device driver. It might also sound a little unnecessary, until you start writing file system code that works with disk images - e.g. a FAT12 disk image stored on a CD-ROM (where the file system code expects 512-byte sectors but the actual device uses 2048-byte sectors). It might also be a very interesting foundation for a much more efficient file system, where disk space isn't lost due to partially used sectors.
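The translation in the paragraph above can be sketched directly: with 512-byte sectors the split is a shift by 9 bits (2^9 = 512), with the low 9 bits giving the offset within the sector. The function names here are illustrative:

```c
#include <stdint.h>

#define SECTOR_SHIFT 9
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)   /* 512-byte sectors */

/* Sector ("Logical Block Address") containing a given byte address. */
static inline uint64_t lba_of(uint64_t byte_addr)
{
    return byte_addr >> SECTOR_SHIFT;
}

/* Offset of that byte within its sector. */
static inline uint32_t offset_of(uint64_t byte_addr)
{
    return (uint32_t)(byte_addr & (SECTOR_SIZE - 1));
}
```

Plugging in Brendan's example, byte address 0x123456789ABCDEF lands at offset 0x1EF in sector 0x91A2B3C4D5E6, matching the figures in the post; a partial-sector write then becomes read sector, patch the 6 bytes at that offset, write sector back, which is exactly the read-modify-write a small block cache in the driver would absorb.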
Cheers,
Brendan