VFS as a simple dictionary

NickJohnson · Post by **NickJohnson** » Sun May 02, 2010 1:42 pm

I've been working the last few days on my VFS, and it's surprisingly hard to get working. It's not even what most people consider a full VFS: the only metadata stored is the pid of the file's driver and its inode number, so the querying process can ask the driver directly for full information about the file. My main problem has been with text processing (I haven't implemented strtok yet, but C isn't great for text processing anyway). I've been storing everything as a big n-ary tree, which is not that fast, because each directory is a linked list of subdirectories/files that aren't in any order.

So, I realized, why not use a simple dictionary algorithm instead of a tree of inodes? A trie would give absolute optimal space and time usage (asymptotically), and take only a few minutes to implement fully. The VFS wouldn't even have to keep track of whether things were directories or files: only drivers can modify it, so inconsistency (e.g. /bin/echo and /bin/echo/ls both existing) would be the filesystem driver's fault. It shouldn't be harder to implement mountpoints either: just have separate tries for each device, and splice them at the appropriate points.

Does anyone see a problem with implementing a VFS like this? Performance implications? Better algorithms than a trie for this kind of dictionary, that can still be spliced at relative positions?

JamesM · Post by **JamesM** » Sun May 02, 2010 2:20 pm

NickJohnson wrote:I've been working the last few days on my VFS, and it's surprisingly hard to get working. It's not even what most people consider a full VFS: the only metadata stored is the pid of the file's driver and its inode number, so the querying process can ask the driver directly for full information about the file. My main problem has been with text processing (I haven't implemented strtok yet, but C isn't great for text processing anyway). I've been storing everything as a big n-ary tree, which is not that fast, because each directory is a linked list of subdirectories/files that aren't in any order.

So, I realized, why not use a simple dictionary algorithm instead of a tree of inodes? A trie would give absolute optimal space and time usage (asymptotically), and take only a few minutes to implement fully. The VFS wouldn't even have to keep track of whether things were directories or files: only drivers can modify it, so inconsistency (e.g. /bin/echo and /bin/echo/ls both existing) would be the filesystem driver's fault. It shouldn't be harder to implement mountpoints either: just have separate tries for each device, and splice them at the appropriate points.

Does anyone see a problem with implementing a VFS like this? Performance implications? Better algorithms than a trie for this kind of dictionary, that can still be spliced at relative positions?

Hi,

I've pretty much been down this road. The VFS I designed in Pedigree was going to use a radix tree (a space-compressed trie) and treat the entire filesystem as a large string search space. One lookup.

The problem is with inherited UNIX-style permissions which are difficult to cache on a per-file basis, and symbolic links. Because of symlinks there is not a 1:1 mapping between string keys and file inodes, rather it is a many-to-one mapping.

Off the top of my head the main problem I encountered was cache entry invalidation. Permissions of a directory get changed, or a directory entry is removed; all entries for children of that directory need to be updated with the new information (whatever it is; assuming nothing about your implementation but at least some hereditary attributes are almost always present in a VFS).

This means you have to find every key that points to any inode below the current directory in the VFS tree, which is now tricky because, remember, you have a n:1 mapping of these keys.

You could always keep track of the mapping list for each inode. However that list is potentially infinite. Paths such as "/a/./././././b" and "/a/b" are identical, and although you can always strip "./" and "../" in a quick string preprocessing stage, one can always symlink a file to . or .., and play the same game with that.

So you have an infinite list, which you obviously can't generate eagerly because it's infinite. But you run into problems generating it lazily too, because it can be potentially huge (and do you garbage collect it? or let it grow?)

I came to the conclusion that with symlinks available it wouldn't work at all; that the only option is to parse the string segment by segment. This is what I chose for the final design of Pedigree's VFS, with the segment-name lookup being done with a radix tree (O(k) where k is the length of the key and not prone to the bloat that tries are).

James

NickJohnson · Post by **NickJohnson** » Sun May 02, 2010 3:14 pm

I see what you're saying, but it seems to only apply to a VFS that caches permissions - mine doesn't. The filesystem drivers are the only things that modify the VFS, and they can keep track of permissions and cache them internally however they want. That means I never have to care about updating permissions recursively within a directory.

It definitely wouldn't work to have symlinks be stored as all possible strings that run though the symlink (as you described), but I'm not sure it would be impossible to have symlinks altogether. Couldn't symlinks be specially-marked nodes that cause a text substitution partway though the search? For example, the search "/foo/bar/baz" on a tree where "/foo/bar" is a symlink to "/a" would be searched as "/foo/bar/baz", then when it hits "/foo/bar", the search would be re-run as "/a/baz" ("/a", the symlink + "/baz", the remaining search). This would only require looking ahead one character (for a / or '\0') when encountering a potential symlink.

JamesM · Post by **JamesM** » Tue May 04, 2010 3:14 am

NickJohnson wrote:I see what you're saying, but it seems to only apply to a VFS that caches permissions - mine doesn't. The filesystem drivers are the only things that modify the VFS, and they can keep track of permissions and cache them internally however they want. That means I never have to care about updating permissions recursively within a directory.

It definitely wouldn't work to have symlinks be stored as all possible strings that run though the symlink (as you described), but I'm not sure it would be impossible to have symlinks altogether. Couldn't symlinks be specially-marked nodes that cause a text substitution partway though the search? For example, the search "/foo/bar/baz" on a tree where "/foo/bar" is a symlink to "/a" would be searched as "/foo/bar/baz", then when it hits "/foo/bar", the search would be re-run as "/a/baz" ("/a", the symlink + "/baz", the remaining search). This would only require looking ahead one character (for a / or '\0') when encountering a potential symlink.

Hi,

Sorry I didn't reply to this earlier. Yes, that would work. However, you'd then need to perform a potential symlink check every time you encounter a '/' character, which is exactly the same as a per-directory recursive technique, is it not? At least in complexity.

James

NickJohnson · Post by **NickJohnson** » Tue May 04, 2010 5:33 pm

You could make each node have a pointer to a string that holds the symlink's link. If it's null, you know it's not a symlink, you don't have to check for a '/'; otherwise, you check for a '/' to see whether you are invoking the symlink or accessing a file with the same prefix.

I don't understand how always checking for a '/' would make it as complex as a directory tree though: directory trees still require searching a linked list (or a radix tree in your case, which is faster but yet more complicated) of their contents at every recursion, but this pure trie/radix tree implementation is relatively simple and uniform at each step.

Edit: actually, I'm also thinking that my original idea for symlinks might work too: instead of doing a text substitution, which creates a chunk of memory that needs to be freed at some point, a symlink could simply be a pointer to a different position in the tree, where the search should be continued. It would make the tree no longer a tree (by adding cycles), but would be significantly faster and remove local state by not requiring copied strings. It would also be possible to implement .. and . with symlinks then, although it would still probably be cleaner in a preprocessing stage.

NickJohnson · Post by **NickJohnson** » Tue May 04, 2010 7:33 pm

Okay, I'm pretty sure the pointer technique works fine. I made a simple implementation with an interactive test loop. It supports symlinks of any level perfectly, and proper freeing of removed nodes (afaik - it at least doesn't free stuff it shouldn't). The code is completely uncommented and messy, but you can take a look at it if you want.

OSDev.org

VFS as a simple dictionary

VFS as a simple dictionary

Re: VFS as a simple dictionary

Re: VFS as a simple dictionary

Re: VFS as a simple dictionary

Re: VFS as a simple dictionary

Re: VFS as a simple dictionary