Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 3:08 am
by davidv1992
@Brendan: I have a few more questions regarding your approach. First, how do you handle access permissions on directories in the presence of symbolic links? Second, how do you handle directories where the user has access/read permissions on the directory itself, but not on all of the parents on the path used to get there? Third, how do you keep the hash table up to date efficiently when the user starts modifying symbolic links?
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 8:29 am
by Brendan
Hi,
davidv1992 wrote:@Brendan: I have a few more questions regarding your approach. First, how do you handle access permissions on directories in the presence of symbolic links? Second, how do you handle directories where the user has access/read permissions on the directory itself, but not on all of the parents on the path used to get there?
For all operations, the access permissions at the "final target" are the only permissions that matter. If you try to delete a symbolic link then the symbolic link is the "final target" and its permissions are used; but if you try to access something where a symbolic link is just part of the path then the symbolic link isn't the "final target" and its permissions are ignored.
This is just faster and simpler. For example, if you open the file "foo/bar/baz" then VFS doesn't waste time checking permissions for "foo" or "bar" - it only finds the inode for "baz" (using the hash) and checks permissions for that.
Note: For my latest design I'm shifting all permission checking (not just for file IO) into a "security manager" process/module. To check permissions the VFS sends a message to the security manager containing the details, and the security manager sends back a "permitted/not permitted" reply. This has multiple advantages, but it also has a major performance disadvantage caused by the overhead of messaging (potential task switches, etc). Doing a permission check for each thing along a path (rather than only doing one permission check for the "final target") would turn that major performance disadvantage into a massive disaster.
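As a rough illustration of the "final target only" check described above, here is a minimal C sketch. It assumes the VFS cache is a hash table keyed by the full path string. The names vnode, cache_insert, cache_lookup and may_access, the table size and the permission bit are all hypothetical and are not taken from Brendan's code. Symbolic links, versions and the message to a separate security manager are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64u         /* hypothetical number of hash buckets */
#define PERM_READ  4u          /* hypothetical permission bit */

struct vnode {
    const char   *path;        /* full path, used as the hash key */
    unsigned      perms;       /* permissions of this node only */
    struct vnode *next;        /* chain for entries in the same bucket */
};

static struct vnode *table[TABLE_SIZE];

/* 32-bit FNV-1a hash over the whole path string. */
static uint32_t hash_path(const char *path)
{
    uint32_t h = 2166136261u;
    for (; *path; path++) {
        h ^= (unsigned char)*path;
        h *= 16777619u;
    }
    return h;
}

static void cache_insert(struct vnode *n)
{
    assert(n != NULL && n->path != NULL);
    uint32_t slot = hash_path(n->path) % TABLE_SIZE;
    n->next = table[slot];
    table[slot] = n;
}

/* One hash lookup for the complete path; no per-component walk. */
static struct vnode *cache_lookup(const char *path)
{
    assert(path != NULL);
    struct vnode *n = table[hash_path(path) % TABLE_SIZE];
    while (n != NULL && strcmp(n->path, path) != 0)
        n = n->next;
    return n;
}

/* Returns 1 if permitted, 0 if denied, -1 on a cache miss.
 * Only the final target's permissions are consulted. */
static int may_access(const char *path, unsigned wanted)
{
    struct vnode *n = cache_lookup(path);
    if (n == NULL)
        return -1;
    return (n->perms & wanted) == wanted;
}
```

With entries for "foo" (no read permission) and "foo/bar/baz" (readable), may_access("foo/bar/baz", PERM_READ) succeeds even though "foo" itself would be denied, which is exactly the behaviour davidv1992 was asking about.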
davidv1992 wrote:Third, how do you keep the hash table up to date efficiently when the user starts modifying symbolic links?
This is where things get a little tricky.
My file system is versioning. What this means is that you can't modify files or symbolic links. Instead you create a new version of the file or symbolic link and then (optionally) delete the old version. Creating a new version of an existing symbolic link is no different to creating any new symbolic link (VFS will start using/caching it, and add any "alternative paths" to any inodes that can be reached by it if/when they're used). If the old version isn't deleted, then the old version continues to behave the same as it always did (but because it's not the current/newest version it can only be accessed by specifying a specific version in the path - e.g. "/foo:123456789/bar.txt" will use version 123456789 of "foo" instead of using the newest/current version).
If the old version is deleted, then the VFS sets a "deleted" flag on the symbolic link itself (but doesn't free/remove the symbolic link, so that it can be "undeleted" later - deleted things are only freed/removed when the OS is running out of disk space and has to do garbage collection). The "deleted" flag is checked when anything is used, and if a symbolic link's "deleted" flag is set then the symbolic link will be ignored during "VFS cache miss or bad path" handling. Then VFS removes the corresponding hash table entry; and then the VFS searches inodes and removes any "alternative paths" that might (or might not) have used the symbolic link.
For example, if "/first/second/foo" is the symbolic link that was deleted, then if an inode has the path "/bar/baz.txt" (which can't contain any symbolic links) and has the alternative path "/foo/baz.txt" (that must contain at least one symbolic link somewhere), then that alternative path is removed from the inode because it contains "foo", even if it's a completely unrelated "foo" and the alternative path didn't need to be removed at all. If the alternative path was removed when it didn't need to be then the VFS will just add it back later as part of the "VFS cache miss or bad path" handling (if/when the alternative path is used again).
This "overly aggressive" approach avoids the need to do an ugly recursive search for cases where multiple symbolic links are used within the same alternative path. Essentially; rather than determining if "/foo/baz.txt" definitely did or definitely didn't rely on the deleted symbolic link, it's faster/easier to just assume it did.
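A minimal C sketch of that "assume it did" pruning step, under stated assumptions: each inode keeps a small fixed array of alternative paths, and a path is dropped whenever any of its components equals the final component of the deleted link. The struct layout, MAX_ALT, has_component and prune_alt_paths are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ALT 4   /* hypothetical cap on alternative paths per inode */

struct inode {
    const char *path;           /* canonical path, contains no symlinks */
    const char *alt[MAX_ALT];   /* paths that go through symlinks; NULL = unused */
};

/* True if `name` occurs as a complete component of `path`. */
static bool has_component(const char *path, const char *name)
{
    size_t len = strlen(name);
    const char *p = path;
    while (*p) {
        while (*p == '/')
            p++;
        size_t n = strcspn(p, "/");
        if (n == len && n > 0 && strncmp(p, name, len) == 0)
            return true;
        p += n;
    }
    return false;
}

/* Drops every alternative path that mentions the deleted link's name,
 * without checking whether it really went through that link.
 * Returns the number of alternative paths removed. */
static int prune_alt_paths(struct inode *ino, const char *deleted_link)
{
    const char *slash = strrchr(deleted_link, '/');
    const char *name = slash ? slash + 1 : deleted_link;
    assert(*name != '\0');      /* the link path must not end in '/' */

    int removed = 0;
    for (int i = 0; i < MAX_ALT; i++) {
        if (ino->alt[i] != NULL && has_component(ino->alt[i], name)) {
            ino->alt[i] = NULL; /* may be re-added on a later cache miss */
            removed++;
        }
    }
    return removed;
}
```

Using the example from the post, pruning for the deleted link "/first/second/foo" removes the alternative path "/foo/baz.txt" from the inode for "/bar/baz.txt", whether or not that "foo" was the same link.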
Cheers,
Brendan
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 11:44 am
by Korona
Brendan, I think your approach is quite nice if one does not aim for POSIX compatibility. Indeed, multi-versioning is an elegant way to handle concurrent changes to the file system structure.
However, I also have to defend POSIX here:
Brendan wrote:This is why I wrote "Note: POSIX may or may not be compatible with "should"." in my original post - I wasn't entirely sure, but suspected that POSIX is a hideous disaster that gets everything wrong (and that there's a huge amount of software that has bugs in the presence of symbolic links because POSIX does it wrong).
While POSIX is one of the most inconsistent APIs that one can imagine, I don't think it gets path resolution wrong. As I stated earlier, POSIX takes the pragmatic approach here and prefers simplicity of design and performance over more complex solutions. By not supporting hard links to directories, unlink(), rename() and similar operations get a deterministic constant worst-case running time. Yes, your multi-versioning scheme also provides that, but it is certainly more complex and harder to understand for end-users, especially for end-users that are used to working on "traditional" systems.
I do think that the POSIX semantics of "cd" are incredibly broken though. Having "cd" not match chdir() is just stupid.
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 4:20 pm
by Brendan
Hi,
Korona wrote:Brendan, I think your approach is quite nice if one does not aim for POSIX compatibility. Indeed, multi-versioning is an elegant way to handle concurrent changes to the file system structure.
However, I also have to defend POSIX here:
Brendan wrote:This is why I wrote "Note: POSIX may or may not be compatible with "should"." in my original post - I wasn't entirely sure, but suspected that POSIX is a hideous disaster that gets everything wrong (and that there's a huge amount of software that has bugs in the presence of symbolic links because POSIX does it wrong).
While POSIX is one of the most inconsistent APIs that one can imagine, I don't think it gets path resolution wrong. As I stated earlier, POSIX takes the pragmatic approach here and prefers simplicity of design and performance over more complex solutions. By not supporting hard links to directories, unlink(), rename() and similar operations get a deterministic constant worst-case running time. Yes, your multi-versioning scheme also provides that, but it is certainly more complex and harder to understand for end-users, especially for end-users that are used to working on "traditional" systems.
I do think that the POSIX semantics of "cd" are incredibly broken though. Having "cd" not match chdir() is just stupid.
The intuitive behaviour is that going "up" to a parent directory is the opposite of going "down" into a subdirectory; and that "down then up" (e.g. "cd foo/..") leaves you back where you started; because this is how everything (except symbolic links) does it. Because it's not intuitive, people that are used to "traditional" systems (including the developers of PHP, Nautilus, XnView, Oracle's Java and GNU coreutils) consistently make mistakes. By making the behaviour of "cd" broken and stupid so that it matches the broken and stupid behaviour of "chdir" you'd just end up with a whole lot more people making mistakes.
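To make the "down then up" expectation concrete, here is a small C sketch of purely textual ".." handling, which is roughly what a shell has to do to offer that behaviour on top of chdir(). The function name collapse_dotdot is hypothetical, and the sketch only handles absolute paths; it never consults the file system, so it cannot notice whether a component is a symbolic link.

```c
#include <assert.h>
#include <string.h>

/* Collapses "." and ".." components of an absolute path in place,
 * purely by editing the string. */
static void collapse_dotdot(char *path)
{
    assert(path != NULL && path[0] == '/');
    char *out = path;           /* end of the output written so far */
    const char *in = path;      /* next unread input character */

    while (*in) {
        while (*in == '/')
            in++;
        size_t n = strcspn(in, "/");
        if (n == 0)
            break;              /* only trailing slashes were left */
        if (n == 1 && in[0] == '.') {
            /* "." is the current directory: emit nothing */
        } else if (n == 2 && in[0] == '.' && in[1] == '.') {
            /* "..": erase the most recently emitted component, if any */
            while (out > path) {
                out--;
                if (*out == '/')
                    break;
            }
        } else {
            /* ordinary component: out is always behind in here */
            *out++ = '/';
            memmove(out, in, n);
            out += n;
        }
        in += n;
    }
    if (out == path)
        *out++ = '/';           /* everything collapsed to the root */
    *out = '\0';
}
```

For example, "/home/user/foo/.." collapses to "/home/user" regardless of what "foo" is on disk, whereas chdir("foo") followed by chdir("..") depends on where "foo" actually points.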
Cheers,
Brendan
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 5:04 pm
by Korona
The intuitive behavior of symbolic links is that they redirect you to another place in the directory tree. They do not join two places in the file system. That is what hard links do. If x is a symlink to y/z and I type "cd x", the intuitive thing would be for the prompt to just show "y/z". That way it would be clear that "cd .." takes me back to "y". The intuitive behavior is not to pretend that symlinks are hard links. "cd" pretends that symlinks are hard links and that is broken. People have a wrong idea of what a symlink does because "cd" special cases symlinks. Note that it is not the concept of a symlink that is to blame here but the stupid specification of "cd". If "cd" behaved sanely, nobody would be confused about how symlinks work.
chdir() is not broken and it is also not a special case. chdir() resolves paths exactly the same way as open() or any other function that takes a path. It's "cd" that is the special case here. Note that chdir() is not the only way to get a persistent handle to a path. You can also open() a directory and then use openat() to navigate through it. All these functions behave consistently. "cd" breaks this consistency. The "cd" behavior fails if the symlink is changed, moved or deleted. The general path resolution algorithm handles all these cases correctly.
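The consistency claim can be checked with a short C sketch that uses only open(), openat() and fstat(). The helper names make_demo_tree, stat_parent_via_openat and same_file are hypothetical, and the sketch assumes a writable /tmp. It builds a symlink a -> b/c, opens "a" as a directory, then opens ".." relative to that descriptor and compares the result with "b".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates b/, b/c/ and the symlink a -> b/c in a fresh temporary
 * directory and makes that directory the working directory. */
static int make_demo_tree(void)
{
    char tmpl[] = "/tmp/symlink-demo-XXXXXX";
    if (mkdtemp(tmpl) == NULL || chdir(tmpl) != 0)
        return -1;
    if (mkdir("b", 0755) != 0 || mkdir("b/c", 0755) != 0)
        return -1;
    return symlink("b/c", "a");
}

/* True if both stat results describe the same file. */
static bool same_file(const struct stat *x, const struct stat *y)
{
    return x->st_dev == y->st_dev && x->st_ino == y->st_ino;
}

/* Opens `dir`, then ".." relative to that descriptor, and stores the
 * parent's stat data in `out`. Returns 0 on success, -1 on failure. */
static int stat_parent_via_openat(const char *dir, struct stat *out)
{
    assert(dir != NULL && out != NULL);
    int fd = open(dir, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;
    int parent = openat(fd, "..", O_RDONLY | O_DIRECTORY);
    close(fd);
    if (parent < 0)
        return -1;
    int rc = fstat(parent, out);
    close(parent);
    return rc;
}
```

After make_demo_tree(), the parent reached through "a" should be the same file as "b" and not the directory that contains the symlink, matching what chdir("a") followed by chdir("..") does.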
You can argue that POSIX should support hard links to directories so that users can still get the "cd" behavior if they want it. That is a valid concern. Unfortunately, hard links to directories cannot be efficiently implemented in traditional non-versioned file systems. Hard links to directories prevent garbage collection via reference counting and require full tracing to reclaim unreachable directories. That is an unacceptable penalty in many situations. Luckily, a limited emulation of directory hard links is available on modern UNIX systems in the form of bind mounts.
How should path resolution in a traditional UNIX system work if the symlink concept is broken according to you? What is your suggested fix?
EDIT: Note that you still need full tracing (to clean up in-memory inodes) even if you "emulate" directory hard links in the VFS. Just pretending that symlinks are hard links does not work.
Also note that bind mounts do not suffer from that problem because they create another view of the file system and inside this view everything is still a tree.
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 6:07 pm
by Brendan
Hi,
Korona wrote:The intuitive behavior of symbolic links is that they redirect you to another place in the directory tree. They do not join two places in the file system. That is what hard links do. If x is a symlink to y/z and I type "cd x", the intuitive thing would be for the prompt to just show "y/z". That way it would be clear that "cd .." takes me back to "y". The intuitive behavior is not to pretend that symlinks are hard links. "cd" pretends that symlinks are hard links and that is broken. People have a wrong idea of what a symlink does because "cd" special cases symlinks. Note that it is not the concept of a symlink that is to blame here but the stupid specification of "cd". If "cd" behaved sanely, nobody would be confused about how symlinks work.
No. The intuitive behaviour is that the file system is a hierarchical tree (without idiotic "one way teleportation traps" waiting for unsuspecting victims that appear and disappear as symbolic links are created, modified and removed); and people spent a lot of effort to make "cd" behave as people expect despite the fact that the underlying "chdir()" is a crippled joke designed by incompetent fools that cared more about their own convenience than the pain and complexity they were creating for everyone else.
Korona wrote:You can argue that POSIX should support hard links to directories so that users can still get the "cd" behavior if they want it. That is a valid concern. Unfortunately, hard links to directories cannot be efficiently implemented in traditional non-versioned file systems. Hard links to directories prevent garbage collection via reference counting and require full tracing to reclaim unreachable directories. That is an unacceptable penalty in many situations. Luckily, a limited emulation of directory hard links is available on modern UNIX systems in the form of bind mounts.
Note that nothing I have said in this entire topic (and as far as I can remember, nothing I have ever said on these forums ever) has involved hard links.
Korona wrote:How should path resolution in a traditional UNIX system work if the symlink concept is broken according to you? What is your suggested fix?
A traditional UNIX system should continue being a crippled, broken and idiotic joke; because actually fixing problems, improving anything, and doing things properly would break compatibility (and because compatibility is the only reason Unix exists). My suggested fix is to throw Unix in the trash where it belongs so that future generations of people don't have to suffer.
Cheers,
Brendan
Re: how do POSIX systems determine the current working direc
Posted: Thu Nov 02, 2017 9:00 pm
by Notturno
Good discussion.
Re: how do POSIX systems determine the current working direc
Posted: Fri Nov 03, 2017 3:08 am
by Korona
Brendan wrote:No. The intuitive behaviour is that the file system is a hierarchical tree (without idiotic "one way teleportation traps" waiting for unsuspecting victims that appear and disappear as symbolic links are created, modified and removed); and people spent a lot of effort to make "cd" behave as people expect despite the fact that the underlying "chdir()" is a crippled joke designed by incompetent fools that cared more about their own convenience than the pain and complexity they were creating for everyone else.
It's "two way teleportation portals" that break the hierarchy. One way teleportation traps at least don't break the expectation that the file system is a tree. With two way teleportation portals, the "canonical descendant of" property is no longer a partial order of files, which is exactly what we expected from a hierarchy.
Brendan wrote:Note that nothing I have said in this entire topic (and as far as I can remember, nothing I have ever said on these forums ever) has involved hard links.
Well, you did not name them. Two way teleportation portals are hard links (at least, their expected behavior with respect to VFS inode lifetimes and path resolution is the same). Call them whatever you want - the problems that I wrote about remain.
Brendan wrote:A traditional UNIX system should continue being a crippled, broken and idiotic joke; because actually fixing problems, improving anything, and doing things properly would break compatibility (and because compatibility is the only reason Unix exists). My suggested fix is to throw Unix in the trash where it belongs so that future generations of people don't have to suffer.
Sure, throw away UNIX. But that was not the point of my question. What do you replace the UNIX path resolution with? Demand that every file system has to do multi-versioning in 2017? That answer is a bit simplistic, isn't it?
---
Actually, I put some more thought into this and I think that hard link-like (two way teleportation portal) behavior can actually be supported sanely. Replace them by "bind links" that perform spontaneous bind mounts which only stay valid while there is a reference (i.e. a process' working directory or a file descriptor etc.) to a file behind the bind mount. Detach the bind mount from the file system if the link is modified, renamed or removed.
This sometimes leaves processes in unreachable views of the file system and breaks the canonical order but does not suffer from the lifetime problems real hard links suffer from. It might actually be useful in some situations but is by no means a replacement for real symlinks aka one way teleportation traps.
Re: how do POSIX systems determine the current working direc
Posted: Wed Nov 15, 2017 10:25 am
by mariuszp
That all makes sense but there is another problem.
If a filesystem is mounted in multiple locations, testing on Linux shows that in all of these locations, ".." points to the parent of the mountpoint. How is this handled?
If the kernel simply stores the VFS inode of the working directory, how is it supposed to know which mountpoint was used to reach it?
Re: how do POSIX systems determine the current working direc
Posted: Wed Nov 15, 2017 10:40 am
by davidv1992
I am not sure how Linux does this internally, but my OS handles this by essentially having different VFS inodes for each of these mount points. In fact, in my system, if a filesystem is mounted in multiple locations, each of those mount locations has VFS inodes completely separate from all the others. This leads to some duplication of information in those cases, but I decided they are rare enough in practice so as to not matter too much.
Re: how do POSIX systems determine the current working direc
Posted: Wed Nov 15, 2017 10:59 am
by mariuszp
I suppose it is also possible to just disallow multiple mountpoints (which just leaves me to figure out an elegant way to move "/dev" etc. from the kernel root directory into the real root directory from the disk... I guess I'll have a remount() function).
Re: how do POSIX systems determine the current working direc
Posted: Wed Nov 15, 2017 11:41 am
by Korona
In Linux path resolution considers the mount point in addition to the inode.
I don't remember how exactly Linux does it. The canonical implementation (that might very well coincide with the Linux implementation; I study Linux quite a lot when working on my OS) is something like the following: Each mount point is represented by a struct. This struct stores a pointer to the parent mount point, the dentry where it is mounted inside the parent, a pointer to the root dentry of the mounted file system and a map from dentries to mount points that represent the children of this mount point. In other words, the mount points form a tree, but that tree is "more flat" than the directory tree as it does not include intermediate directories. This allows for a nice implementation of mount namespace, bind mounts and master/slave mounts just like in Linux.
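A minimal C sketch of that structure and of how ".." can use it, assuming a lookup position is a (mount, dentry) pair. The type and function names (struct mount, struct dentry, struct location, walk_dotdot) are hypothetical and are not the real Linux definitions; the per-mount map of child mounts is omitted because ".." does not need it.

```c
#include <assert.h>
#include <stddef.h>

struct dentry {
    const char    *name;
    struct dentry *parent;      /* parent in the same file system;
                                   a file system root points to itself */
};

struct mount {
    struct mount  *parent;      /* mount we are attached to; NULL at the top */
    struct dentry *mountpoint;  /* dentry in the parent that we cover */
    struct dentry *root;        /* root dentry of the mounted file system */
    /* map from dentries to child mounts omitted in this sketch */
};

/* A position during path resolution: which mount AND which dentry. */
struct location {
    struct mount  *mnt;
    struct dentry *dentry;
};

/* Resolves ".." from `at`. */
static struct location walk_dotdot(struct location at)
{
    assert(at.mnt != NULL && at.dentry != NULL);

    /* At the root of a mounted file system: step out to the mount
     * point in the parent mount (repeat for stacked mounts). */
    while (at.dentry == at.mnt->root && at.mnt->parent != NULL) {
        at.dentry = at.mnt->mountpoint;
        at.mnt = at.mnt->parent;
    }
    /* Ordinary case: go to the parent inside the current file system.
     * The root of the topmost mount is its own parent. */
    if (at.dentry != at.mnt->root)
        at.dentry = at.dentry->parent;
    return at;
}
```

Because the mount is part of the position, the same root dentry mounted in two places yields two different results for "..", which is one way to answer mariuszp's question without duplicating inodes per mount.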
Re: how do POSIX systems determine the current working direc
Posted: Mon Feb 12, 2018 4:57 am
by thomasloven
This was a very interesting discussion.
A funny thing I found is that sh and bash do not seem to expand symlinks.
Fish does, however.
Code:
archakis:~/tmp > sh
sh-4.4$ pwd
/home/thomas/tmp
sh-4.4$ ls -l
totalt 8
lrwxrwxrwx 1 thomas thomas 3 12 feb 11.44 a -> b/c
drwxr-xr-x 3 thomas thomas 4096 12 feb 11.44 b
-rw-r--r-- 1 thomas thomas 270 12 feb 11.47 test.c
sh-4.4$ cd a
sh-4.4$ pwd
/home/thomas/tmp/a
sh-4.4$ cd ..
sh-4.4$ pwd
/home/thomas/tmp
sh-4.4$ fish
archakis:~/tmp > cd a
archakis:~/t/b/c > pwd
/home/thomas/tmp/b/c
archakis:~/t/b/c > cd ..
archakis:~/t/b > pwd
/home/thomas/tmp/b
archakis:~/t/b > cd ..
archakis:~/tmp > cat test.c
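#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    char cwd[1024];
    /* Starting directory. */
    getcwd(cwd, 1024);
    printf("%s\n", cwd);
    /* chdir() follows the symlink "a" to b/c. */
    chdir("a");
    getcwd(cwd, 1024);
    printf("%s\n", cwd);
    /* ".." is the physical parent of b/c, i.e. b. */
    chdir("..");
    getcwd(cwd, 1024);
    printf("%s\n", cwd);
    return 0;
}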
archakis:~/tmp > make test
cc test.c -o test
archakis:~/tmp > ./test
/home/thomas/tmp
/home/thomas/tmp/b/c
/home/thomas/tmp/b
archakis:~/tmp >