Page 1 of 1

How should file system drivers buffer disk data structures?

Posted: Thu Oct 06, 2011 10:06 am
by tomos
In researching ext2 and a VFS layer I was a bit confused over how an OS should buffer on-disk data structures, such as inode tables in an ext2 block group. On-disk data structures being those physically present on the storage medium as part of the file system implementation. And by buffering I mean keeping a copy of the structure in memory for sometime after it is initially accessed. I read that Linux keeps some of them in memory, but I didn't see specifically which ones and how they were handled.

So say you need to read inode X, which of the on-disk data structures for the file system would you want copied to memory after the inode has been read? If you do copy it, how long does it stay there? If you do buffer them, how so?

Take listing a directory for example. First you need to open() the directory. When you open a directory the low-level ext2 code should find the inode, and perhaps read all its data blocks into a temporary buffer which stays allocated for the lifetime of the directory handle (it being opened.) Then subsequent calls to read a directory entry simply reference the already allocated buffer for the directory entries. Closing a directory would involve free the buffer which stored the directories data blocks. Does this approach make sense?

What do you guys think?

Re: How should file system drivers buffer disk data structur

Posted: Thu Oct 06, 2011 3:31 pm
by gerryg400
I asked these questions too. Rather than worry too much I just jumped in and started implementing an VFS, fs driver and HDD driver with no caches. Then I began doing profiling to see what would be required. If your abstractions are good (make sure that they are !!) then it shouldn't be too hard to add caching later on.

There were 2 aspects that were a surprise to me.

1. It worked better for me if the block cache size matched the FS block rather than the HDD block. In other words the cache beneath my ext2 driver matches my ext2 block size (usually 4096 bytes) rather than the hardware block size (512 bytes).

2. A name cache is very important. And the name cache should also contain names that DON'T exist. For example let's say you're running 'make' and the GCC tools. GCC , when searching for an include file on the include paths might call 'stat' many times before it finds the directory that contains "stdio.h". The VFS should remember which files DON'T exist so it doesn't need to search every directory on the include path for every include file in every C file. So the name cache might have info that says that /usr/include/stdio.h has inode 99999 but that /usr/include/X11.h doesn't exist. This makes a huge difference.

Re: How should file system drivers buffer disk data structur

Posted: Thu Oct 06, 2011 7:31 pm
by OSwhatever
gerryg400 wrote:2. A name cache is very important. And the name cache should also contain names that DON'T exist. For example let's say you're running 'make' and the GCC tools. GCC , when searching for an include file on the include paths might call 'stat' many times before it finds the directory that contains "stdio.h". The VFS should remember which files DON'T exist so it doesn't need to search every directory on the include path for every include file in every C file. So the name cache might have info that says that /usr/include/stdio.h has inode 99999 but that /usr/include/X11.h doesn't exist. This makes a huge difference.
Speaking of the name cache the holds the name that don't exist, does this feature exist in most modern operating systems or just a way you solved it? If you as you mentioned make has a lot of directories with a lot of files, the don't exist cache will quickly fill. At some point perhaps seeking a hashed name cache with the existing files be faster?

Re: How should file system drivers buffer disk data structur

Posted: Thu Oct 06, 2011 7:57 pm
by gerryg400
The concept exists in Linux although I don't know how it is implemented. Ihave no idea about windows. I read in an article that Linux does store non-existant entries and that's where I got the idea.

Directory searches are very expensive on most file systems so ideally the name cache needs to be large enough to hold every path/file that might be searched. On a typical desktop this could easily mean thousands of entries.

The article that I read (can't find it now) mentioned that the Linux dcache is hashed by filename and inode of containing directory. This seems like a good idea because (hard and soft) linking and moving can mean that top-livel pathnames can change and it would be very difficult to keep the cache valid if the entire name were hashed.