Random thoughts about filesystems
Posted: Tue Nov 15, 2011 5:31 pm
I've been pondering the design of my VFS for a couple of weeks now. It works pretty much like any other operating system's in the sense that it has files, and files are grouped in directories. My thoughts here are not about implementation details but about why everyone seems to be using this sort of arrangement for resources. There is something itching in the back of my head telling me that this cannot be the best way to manage resources, but at the same time I feel I would be an idiot for thinking I could come up with something better.
I've had to go back to basics in a little thought experiment just to come to terms with what the underlying problem is. What is it I'm actually trying to achieve?
1. Need a way to say: This part of memory is a resource of some kind.
2. Need a way to reference it by labeling it in a way that both the computer and I (the user) can understand.
3. Need a way to group resources belonging to each other in some way.
4. Need a way to find a resource as fast as possible.
I can see how files and directories achieve all of this. A file is a piece of memory that is labeled by an inode and a name, and it is grouped using directories. Files can also be found easily by traversing the tree.
Here are my thoughts on the subject though and this is where it might get laughable.
About 1. Not much to say. You need to say this part of memory is a resource. This I can't do without.
About 2. I think from a program's perspective it would be enough to have a unique number for a resource, like an inode, but do I really need to reference resources by name for the user? Why is it so important that resources have a name, I wonder. If I had a lot of metadata about a resource, I should be able to tell it is the file I'm looking for without having to explicitly look at the name. As long as no two resources are allowed to have the same metadata, there can never be any ambiguity either.
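To make the "no two resources with the same metadata" rule concrete, here is a minimal sketch in Python. It is a toy in-memory model, not kernel code, and everything in it (the `MetadataStore` class, `add`, `lookup_exact`, the example metadata keys) is a name I made up for illustration. It assumes metadata values are simple hashable things like strings and numbers.

```python
class MetadataStore:
    """Toy resource table: programs use inode numbers, users use metadata."""

    def __init__(self):
        self._next_inode = 1
        self._by_inode = {}      # inode number -> metadata dict
        self._by_metadata = {}   # frozen metadata set -> inode number

    def add(self, metadata):
        """Register a resource; refuse it if the metadata is already taken."""
        key = frozenset(metadata.items())
        if key in self._by_metadata:
            raise ValueError("a resource with identical metadata already exists")
        inode = self._next_inode
        self._next_inode += 1
        self._by_inode[inode] = dict(metadata)
        self._by_metadata[key] = inode
        return inode

    def lookup_exact(self, metadata):
        """Return the inode whose metadata matches exactly, or None."""
        return self._by_metadata.get(frozenset(metadata.items()))
```

With this rule in place, the full metadata set plays the role a pathname normally plays: it identifies exactly one resource. The open question is how much of it the user has to type.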
About 3. Grouping in this case would also just be metadata so I wouldn't need directories.
About 4. I imagine finding a file using metadata could potentially be a lot faster than traversing a directory structure, but referencing one particular file would be a nightmare. How would I know how much metadata I need to supply in order to be sure I get the right file back from a lookup? My solution here would be to filter resources before executing the search. What I thought about was to create something similar to a view, which only shows me the files in the filesystem that meet some criterion, i.e. have a certain kind of metadata. In this view only a subset of resources is available to you. If you think about it, for most tasks you are only working on a small set of files at any one time. You are most likely not interested in the entire filesystem all at once. If you switch and work on something else, you might be interested in another set of files. Is it actually necessary to have the entire tree of directories available to you the whole time?
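A rough Python sketch of what I mean by filtering first and searching second. The functions `make_view` and `find`, and the metadata keys `project`, `type` and `subsystem`, are all hypothetical; resources are modeled as a plain dict from inode number to metadata.

```python
def _matches(meta, criteria):
    """True if the metadata has every key/value pair in criteria."""
    return all(meta.get(key) == value for key, value in criteria.items())


def make_view(resources, **criteria):
    """Return only the resources whose metadata meets every criterion."""
    return {inode: meta for inode, meta in resources.items()
            if _matches(meta, criteria)}


def find(view, **query):
    """Return the sorted inode numbers inside the view that match the query."""
    return sorted(inode for inode, meta in view.items()
                  if _matches(meta, query))


RESOURCES = {
    1: {"project": "kernel", "type": "source", "subsystem": "vfs"},
    2: {"project": "kernel", "type": "notes", "subsystem": "vfs"},
    3: {"project": "website", "type": "source", "subsystem": "blog"},
    4: {"project": "kernel", "type": "source", "subsystem": "scheduler"},
}

kernel_view = make_view(RESOURCES, project="kernel")
print(find(RESOURCES, type="source"))                      # [1, 3, 4]
print(find(kernel_view, type="source"))                    # [1, 4]
print(find(kernel_view, type="source", subsystem="vfs"))   # [1]
```

The smaller the view, the less metadata is needed before a lookup narrows down to a single resource. This sketch scans every resource linearly, so it says nothing about whether a real implementation would actually beat a directory walk; that would depend on how the metadata is indexed.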
I am gonna start doing some experiments on implementing views instead of directories. It will be interesting to see how having different views of the filesystem would work and what problems I would run into. I would need a way to define a view (what metadata is part of it), a way to list views and a way to switch between views. I also need some function other than fopen() that opens files by metadata instead of by pathname. Finally, I need to evaluate whether this actually works better somehow.
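As a first stab at what that interface could look like, here is a small Python mock-up. None of these names exist anywhere; `ViewManager`, `define_view`, `list_views`, `switch_to` and `open_by_metadata` are placeholders I invented, and the "filesystem" is just a dict from inode number to a (metadata, contents) pair. The one design decision it encodes is that opening by metadata fails unless exactly one resource in the current view matches.

```python
import io


def _matches(meta, criteria):
    """True if the metadata has every key/value pair in criteria."""
    return all(meta.get(key) == value for key, value in criteria.items())


class ViewManager:
    def __init__(self, resources):
        self._resources = resources  # inode -> (metadata dict, bytes)
        self._views = {}             # view name -> criteria dict
        self._current = None         # None means the whole filesystem

    def define_view(self, name, **criteria):
        """Define a view as a set of metadata a resource must have."""
        self._views[name] = dict(criteria)

    def list_views(self):
        return sorted(self._views)

    def switch_to(self, name):
        if name not in self._views:
            raise KeyError(name)
        self._current = name

    def open_by_metadata(self, **query):
        """Stand-in for fopen(): open the single resource matching the query."""
        criteria = self._views.get(self._current, {})
        hits = [inode for inode, (meta, _) in self._resources.items()
                if _matches(meta, criteria) and _matches(meta, query)]
        if len(hits) != 1:
            raise LookupError(f"{len(hits)} resources match, expected exactly one")
        return io.BytesIO(self._resources[hits[0]][1])
```

The query is checked in addition to the view's criteria rather than merged into them, so a query can never reach outside the current view. Whether an ambiguous lookup should be an error or should return a list to choose from is one of the things the experiments would have to answer.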
Commence laughing.