how do POSIX systems determine the current working directory
Re: how do POSIX systems determine the current working direc
@Brendan: I have got a few further questions regarding your approach: How do you handle access permissions on directories in the presence of symbolic links? Second, how do you handle directories where the user has access/read permissions on the directory itself, but not on all of the parents on the path it used to get there? Third, how do you keep the hash table up to date efficiently when the user starts modifying symbolic links?
Re: how do POSIX systems determine the current working direc
Hi,
davidv1992 wrote: @Brendan: I have got a few further questions regarding your approach: How do you handle access permissions on directories in the presence of symbolic links? Second, how do you handle directories where the user has access/read permissions on the directory itself, but not on all of the parents on the path it used to get there?
For all operations; the access permissions at the "final target" are the only permissions that matter. If you try to delete a symbolic link then the symbolic link is the "final target" and its permissions are used; but if you try to access something where a symbolic link is just part of the path then the symbolic link isn't the "final target" and its permissions are ignored.
This is just faster and simpler. For example, if you open the file "foo/bar/baz" then VFS doesn't waste time checking permissions for "foo" or "bar" - it only finds the inode for "baz" (using the hash) and checks permissions for that.
Note: For my latest design I'm shifting all permission checking (not just for file IO) into a "security manager" process/module. To check permissions the VFS sends a message to the security manager containing the details, and the security manager sends back a "permitted/not permitted" reply. This has multiple advantages, but it also has a major performance disadvantage caused by the overhead of messaging (potential task switches, etc). Doing a permission check for each thing along a path (rather than only doing one permission check for the "final target") would turn that major performance disadvantage into a massive disaster.
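The "final target only" lookup described above could be sketched roughly like this. It is only an illustration under assumptions: the table layout, the FNV-1a hash, the permission bits and every name here (`vfs_inode`, `cache_insert`, `vfs_open_check`) are hypothetical and not taken from Brendan's actual VFS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PERM_READ  0x4u
#define PERM_WRITE 0x2u
#define TABLE_SIZE 64u   /* hypothetical fixed-size cache */

struct vfs_inode {
    const char *path;   /* full path, used as the hash key */
    unsigned perms;     /* permissions of this inode only */
};

static struct vfs_inode *table[TABLE_SIZE];

/* 32-bit FNV-1a over the whole path string. */
uint32_t path_hash(const char *path)
{
    uint32_t h = 2166136261u;
    for (; *path != '\0'; path++) {
        h ^= (unsigned char)*path;
        h *= 16777619u;
    }
    return h;
}

/* Insert with linear probing; returns 0 on success, -1 if the table is full. */
int cache_insert(struct vfs_inode *ino)
{
    uint32_t start = path_hash(ino->path) % TABLE_SIZE;
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        uint32_t slot = (start + i) % TABLE_SIZE;
        if (table[slot] == NULL) {
            table[slot] = ino;
            return 0;
        }
    }
    return -1;
}

/* Find the inode for a full path without visiting any parent directory. */
struct vfs_inode *cache_lookup(const char *path)
{
    uint32_t start = path_hash(path) % TABLE_SIZE;
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        uint32_t slot = (start + i) % TABLE_SIZE;
        if (table[slot] == NULL)
            return NULL;
        if (strcmp(table[slot]->path, path) == 0)
            return table[slot];
    }
    return NULL;
}

/* 1 = permitted, 0 = denied, -1 = cache miss (a real VFS would then walk the path). */
int vfs_open_check(const char *path, unsigned wanted)
{
    struct vfs_inode *ino = cache_lookup(path);
    if (ino == NULL)
        return -1;
    return (ino->perms & wanted) == wanted ? 1 : 0;
}
```

Only one permission check happens per open, regardless of how many components the path has, which is the property the post relies on to keep the security manager round trips down.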
davidv1992 wrote: Third, how do you keep the hash table up to date efficiently when the user starts modifying symbolic links?
This is where things get a little tricky.
My file system is versioning. What this means is that you can't modify files or symbolic links. Instead you create a new version of the file or symbolic link and then (optionally) delete the old version. Creating a new version of an existing symbolic link is no different to creating any new symbolic link (VFS will start using/caching it, and add any "alternative paths" to any inodes that can be reached by it if/when they're used). If the old version isn't deleted, then the old version continues to behave the same as it always did (but because it's not the current/newest version it can only be accessed by specifying a specific version in the path - e.g. "/foo:123456789/bar.txt" will use version 123456789 of "foo" instead of using the newest/current version).
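To make the "name:version" path syntax concrete, here is a minimal sketch of how one path component might be split. The `name:version` form comes from the example above; the struct, the function name and the use of 0 as a "newest version" sentinel are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define VERSION_NEWEST 0ULL   /* hypothetical sentinel: no explicit version given */

struct component {
    char name[256];
    uint64_t version;
};

/* Split "foo:123456789" into name "foo" and version 123456789.
 * A component without ':' refers to the newest version. */
bool parse_component(const char *text, struct component *out)
{
    const char *colon = strchr(text, ':');
    size_t name_len = colon ? (size_t)(colon - text) : strlen(text);

    if (name_len == 0 || name_len >= sizeof out->name)
        return false;
    memcpy(out->name, text, name_len);
    out->name[name_len] = '\0';
    out->version = VERSION_NEWEST;

    if (colon != NULL) {
        char *end;
        /* Require a leading digit so strtoull cannot accept a sign or spaces. */
        if (colon[1] < '0' || colon[1] > '9')
            return false;
        errno = 0;
        unsigned long long v = strtoull(colon + 1, &end, 10);
        if (errno != 0 || *end != '\0')
            return false;
        out->version = v;
    }
    return true;
}
```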
If the old version is deleted, then the VFS sets a "deleted" flag on the symbolic link itself (but doesn't free/remove the symbolic link, so that it can be "undeleted" later - deleted things are only freed/removed when the OS is running out of disk space and has to do garbage collection). The "deleted" flag is checked when anything is used, and if a symbolic link's "deleted" flag is set then the symbolic link will be ignored during "VFS cache miss or bad path" handling. Then VFS removes the corresponding hash table entry; and then the VFS searches inodes and removes any "alternative paths" that might (or might not) have used the symbolic link. For example, if "/first/second/foo" is the symbolic link that was deleted, then if an inode has the path "/bar/baz.txt" (which can't contain any symbolic links) and has the alternative path "/foo/baz.txt" (that must contain at least one symbolic link somewhere), then that alternative path is removed from the inode because it contains "foo", even if it's a completely unrelated "foo" and the alternative path didn't need to be removed at all. If the alternative path was removed when it didn't need to be then the VFS will just add it back later as part of the "VFS cache miss or bad path" handling (if/when the alternative path is used again). This "overly aggressive" approach avoids the need to do an ugly recursive search for cases where multiple symbolic links are used within the same alternative path. Essentially; rather than determining if "/foo/baz.txt" definitely did or definitely didn't rely on the deleted symbolic link, it's faster/easier to just assume it did.
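The "overly aggressive" removal could look something like the following. This is a sketch under assumptions, not the actual implementation: the `inode_entry` layout, the fixed-size array and both function names are hypothetical. It drops every alternative path that has a component with the same name as the deleted link, without checking whether it is really the same link.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALT_PATHS 4   /* hypothetical limit, for the sketch only */

struct inode_entry {
    const char *real_path;                 /* contains no symbolic links */
    const char *alt_paths[MAX_ALT_PATHS];  /* each goes through at least one symlink */
    size_t alt_count;
};

/* True if any '/'-separated component of path is exactly name. */
bool path_has_component(const char *path, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = path;

    while (*p != '\0') {
        while (*p == '/')
            p++;
        const char *end = p;
        while (*end != '\0' && *end != '/')
            end++;
        if (name_len > 0 && (size_t)(end - p) == name_len &&
            memcmp(p, name, name_len) == 0)
            return true;
        p = end;
    }
    return false;
}

/* Remove every alternative path that mentions link_name; returns how many were dropped.
 * Paths removed unnecessarily are simply re-added on a later cache miss. */
size_t drop_alt_paths(struct inode_entry *ino, const char *link_name)
{
    size_t kept = 0, removed = 0;

    for (size_t i = 0; i < ino->alt_count; i++) {
        if (path_has_component(ino->alt_paths[i], link_name))
            removed++;
        else
            ino->alt_paths[kept++] = ino->alt_paths[i];
    }
    ino->alt_count = kept;
    return removed;
}
```

With the example from the post, deleting the link "/first/second/foo" would call `drop_alt_paths(inode, "foo")`, and "/foo/baz.txt" is dropped even though its "foo" may be unrelated.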
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: how do POSIX systems determine the current working direc
Brendan, I think your approach is quite nice if one does not aim for POSIX compatibility. Indeed, multi-versioning is an elegant way to handle concurrent changes to the file system structure.
However, I also have to defend POSIX here:
Brendan wrote: This is why I wrote "Note: POSIX may or may not be compatible with "should"." in my original post - I wasn't entirely sure, but suspected that POSIX is a hideous disaster that gets everything wrong (and that there's a huge amount of software that has bugs in the presence of symbolic links because POSIX does it wrong).
While POSIX is one of the most inconsistent APIs that one can imagine, I don't think it gets path resolution wrong. As I stated earlier, POSIX takes the pragmatic approach here and prefers simplicity of design and performance over more complex solutions. By not supporting hard links, unlink(), rename() and similar operations get a deterministic constant worst-case running time. Yes, your multi-versioning scheme also provides that, but it is certainly more complex and harder to understand for end-users, especially for end-users that are used to working on "traditional" systems.
I do think that the POSIX semantics of "cd" are incredibly broken though. Having "cd" not match chdir() is just stupid.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: how do POSIX systems determine the current working direc
Hi,
Korona wrote: Brendan, I think your approach is quite nice if one does not aim for POSIX compatibility. Indeed, multi-versioning is an elegant way to handle concurrent changes to the file system structure. However, I also have to defend POSIX here: [Brendan wrote: This is why I wrote "Note: POSIX may or may not be compatible with "should"." in my original post - I wasn't entirely sure, but suspected that POSIX is a hideous disaster that gets everything wrong (and that there's a huge amount of software that has bugs in the presence of symbolic links because POSIX does it wrong).] While POSIX is one of the most inconsistent APIs that one can imagine, I don't think it gets path resolution wrong. As I stated earlier, POSIX takes the pragmatic approach here and prefers simplicity of design and performance over more complex solutions. By not supporting hard links unlink(), rename() and similar operations get a deterministic constant worst-case running time. Yes, your multi-versioning scheme also provides that, but it is certainly more complex and harder to understand for end-users, especially for end-users that are used to work on "traditional" systems. I do think that the POSIX semantics of "cd" are incredibly broken though. Having "cd" not match chdir() is just stupid.
The intuitive behaviour is that going "up" to a parent directory is the opposite of going "down" into a subdirectory; and that "down then up" (e.g. "cd foo/..") leaves you back where you started; because this is how everything (except symbolic links) does it. Because it's not intuitive, people that are used to "traditional" systems (including the developers of PHP, Nautilus, XnView, Oracle's Java and GNU coreutils) consistently make mistakes. By making the behaviour of "cd" broken and stupid so that it matches the broken and stupid behaviour of "chdir" you'd just end up with a whole lot more people making mistakes.
Cheers,
Brendan
Re: how do POSIX systems determine the current working direc
The intuitive behavior of symbolic links is that they redirect you to another place in the directory tree. They do not join two places in the file system. That is what hard links do. If x is a symlink to y/z and I type "cd x", the intuitive thing would be for the prompt to just show "y/z". That way it would be clear that "cd .." takes me back to "y". The intuitive behavior is not to pretend that symlinks are hard links. "cd" pretends that symlinks are hard links and that is broken. People have a wrong idea of what a symlink does because "cd" special cases symlinks. Note that it is not the concept of a symlink that is to blame here, but the stupid specification of "cd". If "cd" behaved sanely, nobody would be confused about how symlinks work.
chdir() is not broken and it is also not a special case. chdir() resolves paths exactly the same way as open() or any other function that takes a path. It's "cd" that is the special case here. Note that chdir() is not the only way to get a persistent handle to a path. You can also open() a directory and then use openat() to navigate through it. All these functions behave consistently. "cd" breaks this consistency. The "cd" behavior fails if the symlink is changed, moved or deleted. The general path resolution algorithm handles all these cases correctly.
You can argue that POSIX should support hard links to directories so that users can still get the "cd" behavior if they want it. That is a valid concern. Unfortunately, hard links to directories cannot be efficiently implemented in traditional non-versioned file systems. Hard links to directories prevent garbage collection via reference counting and require full tracing to reclaim unreachable directories. That is an unacceptable penalty in many situations. Luckily, a limited emulation of directory hard links is available on modern UNIX systems in the form of bind mounts.
How should path resolution in a traditional UNIX system work if the symlink concept is broken according to you? What is your suggested fix?
EDIT: Note that you still need full tracing (to clean up in-memory inodes) even if you "emulate" directory hard links in the VFS. Just pretending that symlinks are hard links does not work.
Also note that bind mounts do not suffer from that problem because they create another view of the file system and inside this view everything is still a tree.
Re: how do POSIX systems determine the current working direc
Hi,
Korona wrote: The intuitive behavior of symbolic links is that they redirect you to another place in the directory tree. They do not join two places in the file system. That is what hard links do. If x is a symlink to y/z and I type "cd x", the intuitive thing would be for the prompt to just show "y/z". That way it would be clear that "cd .." takes me back to "y". The intuitive behavior is not to pretend that symlinks are hard links. "cd" pretends that symlinks are hard links and that is broken. People have a wrong idea of what a symlink does because "cd" special cases symlinks. Note that it is not the concept of a symlink that is to blame here, but the stupid specification of "cd". If "cd" behaved sanely, nobody would be confused about how symlinks work.
No. The intuitive behaviour is that the file system is a hierarchical tree (without idiotic "one way teleportation traps" waiting for unsuspecting victims that appear and disappear as symbolic links are created, modified and removed); and people spent a lot of effort to make "cd" behave as people expect despite the fact that the underlying "chdir()" is a crippled joke designed by incompetent fools that cared more about their own convenience than the pain and complexity they were creating for everyone else.
Korona wrote: You can argue that POSIX should support hard links to directories so that users can still get the "cd" behavior if they want it. That is a valid concern. Unfortunately, hard links to directories cannot be efficiently implemented in traditional non-versioned file systems. Hard links to directories prevent garbage collection via reference counting and require full tracing to reclaim unreachable directories. That is an unacceptable penalty in many situations. Luckily, a limited emulation of directory hard links is available on modern UNIX systems in the form of bind mounts.
Note that nothing I have said in this entire topic (and as far as I can remember, nothing I have ever said on these forums ever) has involved hard links.
Korona wrote: How should path resolution in a traditional UNIX system work if the symlink concept is broken according to you? What is your suggested fix?
A traditional UNIX system should continue being a crippled, broken and idiotic joke; because actually fixing problems, improving anything, and doing things properly would break compatibility (and because compatibility is the only reason Unix exists). My suggested fix is to throw Unix in the trash where it belongs so that future generations of people don't have to suffer.
Cheers,
Brendan
Re: how do POSIX systems determine the current working direc
Good discussion.
Last edited by Notturno on Fri Nov 17, 2017 7:33 pm, edited 1 time in total.
Re: how do POSIX systems determine the current working direc
Brendan wrote: No. The intuitive behaviour is that the file system is a hierarchical tree (without idiotic "one way teleportation traps" waiting for unsuspecting victims that appear and disappear as symbolic links are created, modified and removed); and people spent a lot of effort to make "cd" behave as people expect despite the fact that the underlying "chdir()" is a crippled joke designed by incompetent fools that cared more about their own convenience than the pain and complexity they were creating for everyone else.
It's "two way teleportation portals" that break the hierarchy. One way teleportation traps at least don't break the expectation that the file system is a tree. With two way teleportation portals, the "canonical descendant of" property is no longer a partial order of files, which is exactly what we expect from a hierarchy.
Brendan wrote: Note that nothing I have said in this entire topic (and as far as I can remember, nothing I have ever said on these forums ever) has involved hard links.
Well, you did not name them. Two way teleportation portals are hard links (at least, their expected behavior with respect to VFS inode lifetimes and path resolution is the same). Call them whatever you want - the problems that I wrote about remain.
Brendan wrote: A traditional UNIX system should continue being a crippled, broken and idiotic joke; because actually fixing problems, improving anything, and doing things properly would break compatibility (and because compatibility is the only reason Unix exists). My suggested fix is to throw Unix in the trash where it belongs so that future generations of people don't have to suffer.
Sure, throw away UNIX. But that was not the point of my question. What do you replace the UNIX path resolution with? Demand that every file system has to do multi-versioning in 2017? That answer is a bit simplistic, isn't it?
---
Actually, I put some more thought into this and I think that hard link-like (two way teleportation portal) behavior can actually be supported sanely. Replace them by "bind links" that perform spontaneous bind mounts which only stay valid while there is a reference (i.e. a process' working directory or a file descriptor etc.) to a file behind the bind mount. Detach the bind mount from the file system if the link is modified, renamed or removed.
This sometimes leaves processes in unreachable views of the file system and breaks the canonical order but does not suffer from the lifetime problems real hard links suffer from. It might actually be useful in some situations but is by no means a replacement for real symlinks aka one way teleportation traps.
Re: how do POSIX systems determine the current working direc
That all makes sense but there is another problem.
If a filesystem is mounted in multiple locations, testing on Linux shows that in all of these locations, ".." points to the parent of the mountpoint. How is this handled?
If the kernel simply stores the VFS inode of the working directory, how is it supposed to know which mountpoint was used to reach it?
Re: how do POSIX systems determine the current working direc
I am not sure how Linux does this internally, but my OS handles this by essentially having different VFS inodes for each of these mount points. In fact, in my system, if a filesystem is mounted in multiple locations, each of those mount locations has VFS inodes completely separate from all the others. This leads to some duplication of information in those cases, but I decided they are rare enough in practice that it does not matter too much.
Re: how do POSIX systems determine the current working direc
I suppose it is also possible to just disallow multiple mountpoints (which just leaves me to figure out an elegant way to move "/dev" etc. from the kernel root directory into the real root directory from the disk... I guess I'll have a remount() function).
Re: how do POSIX systems determine the current working direc
In Linux path resolution considers the mount point in addition to the inode.
I don't remember how exactly Linux does it. The canonical implementation (that might very well coincide with the Linux implementation; I study Linux quite a lot when working on my OS) is something like the following: Each mount point is represented by a struct. This struct stores a pointer to the parent mount point, the dentry where it is mounted inside the parent, a pointer to the root dentry of the mounted file system and a map from dentries to mount points that represent the children of this mount point. In other words, the mount points form a tree, but that tree is "more flat" than the directory tree as it does not include intermediate directories. This allows for a nice implementation of mount namespace, bind mounts and master/slave mounts just like in Linux.
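A stripped-down sketch of that structure, and of how ".." can be resolved when the same file system is mounted in several places, might look like this. All names are hypothetical, this is not Linux's actual code, and the map from dentries to child mounts (needed for walking down into a mount) is left out to keep it short. The key assumption is that the working directory is stored as a (mount, dentry) pair rather than as a bare inode.

```c
#include <stddef.h>

struct dentry {
    const char *name;
    struct dentry *parent;      /* NULL for the root of a file system */
};

struct mount {
    struct mount *parent;       /* NULL for the root mount */
    struct dentry *mountpoint;  /* dentry in the parent mount that this mount covers */
    struct dentry *root;        /* root dentry of the mounted file system */
};

/* A position in the namespace: which mount we are in, and where inside it. */
struct location {
    struct mount *mnt;
    struct dentry *dentry;
};

/* Resolve ".." from the given location. */
struct location walk_up(struct location at)
{
    /* At the root of a mount: first step out onto the mountpoint in the parent mount.
     * The loop handles mounts stacked directly on another mount's root. */
    while (at.dentry == at.mnt->root && at.mnt->parent != NULL) {
        at.dentry = at.mnt->mountpoint;
        at.mnt = at.mnt->parent;
    }
    /* Then take the ordinary parent; ".." at the global root stays put. */
    if (at.dentry->parent != NULL)
        at.dentry = at.dentry->parent;
    return at;
}
```

Because the mount is part of the location, two mounts that share the same root dentry still lead ".." to different parents, one per mountpoint.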
Re: how do POSIX systems determine the current working direc
This was a very interesting discussion.
A funny thing I found is that sh and bash do not seem to expand symlinks.
Fish does, however.
Code:
archakis:~/tmp > sh
sh-4.4$ pwd
/home/thomas/tmp
sh-4.4$ ls -l
totalt 8
lrwxrwxrwx 1 thomas thomas 3 12 feb 11.44 a -> b/c
drwxr-xr-x 3 thomas thomas 4096 12 feb 11.44 b
-rw-r--r-- 1 thomas thomas 270 12 feb 11.47 test.c
sh-4.4$ cd a
sh-4.4$ pwd
/home/thomas/tmp/a
sh-4.4$ cd ..
sh-4.4$ pwd
/home/thomas/tmp
sh-4.4$ fish
archakis:~/tmp > cd a
archakis:~/t/b/c > pwd
/home/thomas/tmp/b/c
archakis:~/t/b/c > cd ..
archakis:~/t/b > pwd
/home/thomas/tmp/b
archakis:~/t/b > cd ..
archakis:~/tmp > cat test.c
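#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char cwd[1024];

    /* Starting directory. */
    getcwd(cwd, sizeof cwd);
    printf("%s\n", cwd);

    /* "a" is a symlink to b/c; chdir() follows it. */
    chdir("a");
    getcwd(cwd, sizeof cwd);
    printf("%s\n", cwd);

    /* ".." is the physical parent of b/c, i.e. b. */
    chdir("..");
    getcwd(cwd, sizeof cwd);
    printf("%s\n", cwd);
    return 0;
}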
archakis:~/tmp > make test
cc test.c -o test
archakis:~/tmp > ./test
/home/thomas/tmp
/home/thomas/tmp/b/c
/home/thomas/tmp/b
archakis:~/tmp >
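The sh/bash result in the transcript is consistent with POSIX "cd" defaulting to logical mode (-L), where the shell keeps its own textual idea of the working directory and processes ".." on that string before calling chdir(); "cd -P" and "pwd -P" ask for the physical behaviour instead. A minimal sketch of that textual bookkeeping, assuming an absolute and already normalized cwd (the function name `logical_cd` is made up, and a real shell does more, such as checking that the result exists):

```c
#include <stddef.h>
#include <string.h>

/* Apply a cd argument to a logical cwd purely textually.
 * cwd must be an absolute, normalized path in a buffer of cap bytes.
 * Returns 0 on success, -1 if the result would not fit. */
int logical_cd(char *cwd, size_t cap, const char *arg)
{
    size_t len = strlen(cwd);

    if (arg[0] == '/') {                 /* absolute argument: restart at the root */
        if (cap < 2)
            return -1;
        strcpy(cwd, "/");
        len = 1;
    }
    while (*arg != '\0') {
        while (*arg == '/')
            arg++;
        size_t n = strcspn(arg, "/");    /* length of the next component */
        if (n == 0)
            break;
        if (n == 1 && arg[0] == '.') {
            /* "." leaves the path unchanged */
        } else if (n == 2 && arg[0] == '.' && arg[1] == '.') {
            /* ".." strips the last name, whether or not it was a symlink */
            while (len > 1 && cwd[len - 1] != '/')
                len--;
            if (len > 1)
                len--;
            cwd[len] = '\0';
        } else {
            size_t need = len + (len > 1 ? 1 : 0) + n + 1;
            if (need > cap)
                return -1;
            if (len > 1)
                cwd[len++] = '/';
            memcpy(cwd + len, arg, n);
            len += n;
            cwd[len] = '\0';
        }
        arg += n;
    }
    return 0;
}
```

Starting from "/home/thomas/tmp", applying "a" and then ".." gives "/home/thomas/tmp/a" and then "/home/thomas/tmp" again, matching the sh session, while test.c and fish show the physical result that chdir() itself produces.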