Filesystem interface design
Posted: Mon Oct 23, 2006 10:51 pm
OK, I have a fairly functional VFS design which supports both memory-backed file systems and actual disk file systems. It even supports namespaces, so things that are not in the traditional Unix-style FS are still accessible through URL-like prefixes (e.g. "devices://mouse" or "processes://12453").
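Just to illustrate what I mean by those prefixes, the lookup first peels the namespace off the path, roughly like this (all names here are made up, not my actual code):

```cpp
#include <string>
#include <utility>

// Returns {namespace, path}, e.g. {"devices", "mouse"} or {"processes", "12453"}.
// Plain paths with no "://" fall through to the traditional unix-style namespace.
std::pair<std::string, std::string> SplitNamespace(const std::string& path)
{
    std::string::size_type sep = path.find("://");
    if (sep == std::string::npos)
        return std::make_pair(std::string("root"), path);
    return std::make_pair(path.substr(0, sep), path.substr(sep + 3));
}
```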
I have come across something I don't like. EXT2/3, my reference disk-based FS implementation, is designed such that to get a given directory entry you need an inode and an index within that inode. First you locate the inode on disk, which gives you an "array" of dentries that are not of equal size (the size of each is determined by its rec_len member). Then you loop, reading each entry, until you reach the element at your index.
This bothers me because if you want a given element, say the 5th entry in a directory, you need to read the 4 before it to get there. That can result in a lot of redundant reads from disk!
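To make it concrete, this is roughly what that on-disk scan looks like as I understand it (just a sketch; the struct follows the ext2 on-disk directory entry layout, the function name is made up):

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string>

struct Ext2DirEntryHeader {
    uint32_t inode;     // inode number this entry points at (0 = unused slot)
    uint16_t rec_len;   // bytes from the start of this entry to the start of the next
    uint8_t  name_len;  // length of the name that follows this header
    uint8_t  file_type;
    // the name bytes follow immediately after the header, NOT NUL-terminated
};

// Get the name of the Nth live entry in one directory block. There is no way
// around scanning the entries before it: each record's rec_len is the only
// thing that says where the next record begins.
std::string NthEntryName(const uint8_t* block, size_t block_size, size_t index)
{
    size_t offset = 0, seen = 0;
    while (offset + sizeof(Ext2DirEntryHeader) <= block_size) {
        const Ext2DirEntryHeader* de =
            reinterpret_cast<const Ext2DirEntryHeader*>(block + offset);
        if (de->inode != 0) {
            if (seen == index) {
                const char* name = reinterpret_cast<const char*>(block) + offset
                                   + sizeof(Ext2DirEntryHeader);
                return std::string(name, de->name_len);
            }
            ++seen;
        }
        if (de->rec_len == 0)
            break;                 // corrupt block, bail instead of looping forever
        offset += de->rec_len;     // hop to the next variable-sized entry
    }
    return std::string();          // index not in this block
}
```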
So I was looking at how Linux might deal with this, and it seems that perhaps it doesn't, at least not directly. The primary interface is getdents, which also returns an array of non-uniform directory entries. That gives me the impression that Linux returns whole directory listings at a time to user space, leaving it up to user space to parse and store the data efficiently if needed.
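For reference, this is the usage pattern I mean, walking the packed buffer by d_reclen the way the getdents(2)/getdents64(2) man page describes it (just an illustration, calling it via syscall() rather than relying on a libc wrapper):

```cpp
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdio>

// Layout as documented for getdents64(2). The flexible array member
// is a GCC extension in C++.
struct linux_dirent64 {
    unsigned long long d_ino;     // inode number
    long long          d_off;     // offset to the next record
    unsigned short     d_reclen;  // size of THIS record (variable)
    unsigned char      d_type;
    char               d_name[];  // NUL-terminated name
};

int main()
{
    int fd = open(".", O_RDONLY);
    if (fd < 0)
        return 1;
    char buf[4096];
    long nread;
    while ((nread = syscall(SYS_getdents64, fd, buf, sizeof(buf))) > 0) {
        // Walk the packed, non-uniform records; each one says where the next starts.
        long pos = 0;
        while (pos < nread) {
            linux_dirent64* d = reinterpret_cast<linux_dirent64*>(buf + pos);
            printf("%llu  %s\n", d->d_ino, d->d_name);
            pos += d->d_reclen;
        }
    }
    close(fd);
    return 0;
}
```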
So my question is really: how would you guys organize around this seemingly inefficient technique? I am toying with having my FileSystem objects return std::list<DirectoryEntry> objects (I have a fairly conformant libstdc++/libc in kernel land). This could easily be passed up to user space efficiently, given that they want the whole directory listing at a time, and it would make iterating a directory listing easy for the caller: just use a list iterator.
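Something like this is what I have in mind for the interface (DirectoryEntry, FileSystem and the member names are just placeholders, not final):

```cpp
#include <stdint.h>
#include <list>
#include <string>

struct DirectoryEntry {
    uint32_t    inode;  // inode number the entry refers to
    uint8_t     type;   // file, directory, symlink, ...
    std::string name;
};

class FileSystem {
public:
    virtual ~FileSystem() {}
    // Read the whole directory in one pass over the on-disk records and hand
    // back a uniform list, so callers never re-walk the variable-length entries.
    virtual std::list<DirectoryEntry> ReadDirectory(uint32_t dir_inode) = 0;
};

// Caller side: just walk the list; the single pass over the disk records
// already happened inside ReadDirectory().
void PrintListing(FileSystem& fs, uint32_t dir_inode)
{
    std::list<DirectoryEntry> entries = fs.ReadDirectory(dir_inode);
    for (std::list<DirectoryEntry>::const_iterator it = entries.begin();
         it != entries.end(); ++it) {
        // ... use it->name, it->inode ...
    }
}
```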
Keep in mind that I am talking mostly about when the kernel talks to the VFS, not user space; if I can do it efficiently in the kernel, I believe I can give this information to user space efficiently as well.
So the first fault that comes to mind with this idea: what if there are a LOT of entries in the directory? Is it worth creating a huge list then? Caching would help with the redundant reads and such, but it may become an issue once I have more complete write support, at least in terms of added complexity. As for the caching, I am thinking I could store a certain number of megabytes' worth of previously built std::list objects, where any write to the associated inode invalidates the cached listing instantly (causing a refetch from disk). When I hit the upper limit of the cache, the oldest entry gets pushed out. I feel this would be fairly efficient given a decent hit rate.
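A very rough sketch of the cache I am picturing: capped in bytes, oldest-out, with instant invalidation on a write to the directory's inode (all names and the size estimate are made up, and DirectoryEntry is repeated from the earlier sketch so this stands on its own):

```cpp
#include <stddef.h>
#include <stdint.h>
#include <list>
#include <map>
#include <string>

struct DirectoryEntry { uint32_t inode; uint8_t type; std::string name; };

class DirListingCache {
public:
    explicit DirListingCache(size_t max_bytes) : max_bytes_(max_bytes), used_bytes_(0) {}

    // Returns true and fills 'out' on a hit; a miss means refetch from disk.
    bool Lookup(uint32_t dir_inode, std::list<DirectoryEntry>& out) {
        std::map<uint32_t, Entry>::iterator it = cache_.find(dir_inode);
        if (it == cache_.end()) return false;
        out = it->second.listing;
        return true;
    }

    // Called after a fresh read from disk.
    void Insert(uint32_t dir_inode, const std::list<DirectoryEntry>& listing) {
        Invalidate(dir_inode);                       // replace any stale copy
        size_t bytes = EstimateBytes(listing);
        while (used_bytes_ + bytes > max_bytes_ && !age_order_.empty())
            Invalidate(age_order_.front());          // push out the oldest listing
        Entry e; e.listing = listing; e.bytes = bytes;
        cache_[dir_inode] = e;
        age_order_.push_back(dir_inode);
        used_bytes_ += bytes;
    }

    // Called on any write to the directory's inode: the next read refetches.
    void Invalidate(uint32_t dir_inode) {
        std::map<uint32_t, Entry>::iterator it = cache_.find(dir_inode);
        if (it == cache_.end()) return;
        used_bytes_ -= it->second.bytes;
        age_order_.remove(dir_inode);
        cache_.erase(it);
    }

private:
    struct Entry { std::list<DirectoryEntry> listing; size_t bytes; };

    static size_t EstimateBytes(const std::list<DirectoryEntry>& l) {
        size_t n = 0;
        for (std::list<DirectoryEntry>::const_iterator it = l.begin(); it != l.end(); ++it)
            n += sizeof(DirectoryEntry) + it->name.size();
        return n;
    }

    size_t max_bytes_, used_bytes_;
    std::map<uint32_t, Entry> cache_;
    std::list<uint32_t> age_order_;  // front = oldest fetch
};
```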
Just looking for some thoughts
proxy