Page 1 of 2

how do POSIX systems determine the current working directory

Posted: Tue Oct 31, 2017 1:41 pm
by mariuszp
My OS currently uses a really bad system for managing mountpoints, and basically required the path to the working directory to be stored and it has to scan from the root of the filesystem every time it wants to access any file (even for relative paths - it just converts the relative path to absolute and dereferences it).

I want to change this so it works in a more "UNIX-y" way - the current directory is just some inode and we can begin dereferencing from it (and looking up mountpoints by dev/inode number).

The problem is... how does the kernel then know what path to store in /proc/self/cwd ? do I have to store the path used to reach every directory inode? linux doesn't seem to have a path in "struct inode", so i think i must be missing some obvious solution.

Re: how do POSIX systems determine the current working direc

Posted: Tue Oct 31, 2017 1:52 pm
by iansjack
I'm not quite sure that I understand your problem. As you say, a directory is referenced by an inode, so just store the inode number in the current directory field.

Re: how do POSIX systems determine the current working direc

Posted: Tue Oct 31, 2017 2:42 pm
by mariuszp
iansjack wrote:I'm not quite sure that I understand your problem. As you say, a directory is referenced by an inode, so just store the inode number in the current directory field.
Yes, that's the idea. But now, how do I implement getcwd() ? do I just have to store the name of the directory in the inode, and look up names of all parent directories?

Re: how do POSIX systems determine the current working direc

Posted: Tue Oct 31, 2017 3:22 pm
by Brendan
Hi,
mariuszp wrote:
iansjack wrote:I'm not quite sure that I understand your problem. As you say, a directory is referenced by an inode, so just store the inode number in the current directory field.
Yes, that's the idea. But now, how do I implement getcwd() ? do I just have to store the name of the directory in the inode, and look up names of all parent directories?
Whenever the current working directory is set or changed, store the new path string somewhere and find the inode and store that too. When someone does "getcw()" you return the path you stored.

Don't forget that (because of things like symbolic links) there can be any number of paths to an inode. For an example of why that matters; if "/a/foo" is a symbolic link to "/b/bar" and the current working directory is "a", then if you do "chdir("foo/..")" then you should (see note) end up with the current working directory still being "/a" and not changed to "/b".

Note: POSIX may or may not be compatible with "should".


Cheers,

Brendan

Re: how do POSIX systems determine the current working direc

Posted: Tue Oct 31, 2017 4:06 pm
by Korona
POSIX (or any other system I know of) does not support hard links to directories, so the canonical path of a directory can be constructed by following .. links.

Just store a chain of directory entries that point to the current directory.

Re: how do POSIX systems determine the current working direc

Posted: Tue Oct 31, 2017 7:23 pm
by Brendan
Hi,
Korona wrote:
Brendan wrote:Don't forget that (because of things like symbolic links) there can be any number of paths to an inode.
POSIX (or any other system I know of) does not support hard links to directories, so the canonical path of a directory can be constructed by following .. links.
As far as I know; symbolic links (not hard links) to directories have been a standard part of POSIX for decades.

I've been using symbolic links (not hard links) to directories on Linux for ages; mostly so that I can do "instant change-over" of (parts of) my web site by just changing the symbolic link to point to a completely new/different directory structure.


Cheers,

Brendan

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 4:13 am
by linguofreak
Korona wrote:POSIX (or any other system I know of) does not support hard links to directories, so the canonical path of a directory can be constructed by following .. links.

Just store a chain of directory entries that point to the current directory.
Unix System V apparently allowed hard links to directories, but only root could create them.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 4:52 am
by Korona
Brendan wrote:As far as I know; symbolic links (not hard links) to directories have been a standard part of POSIX for decades.
Sure. But symbolic links do not change the canonical path to a file. If you change your working directory to x/ and x is a symlink to y/z/, then changing it to .. will land you in y/.

EDIT: Oh, you actually talked about that before. I misread your previous post:
Brendan wrote:Don't forget that (because of things like symbolic links) there can be any number of paths to an inode. For an example of why that matters; if "/a/foo" is a symbolic link to "/b/bar" and the current working directory is "a", then if you do "chdir("foo/..")" then you should (see note) end up with the current working directory still being "/a" and not changed to "/b".
That is actually wrong. You are confusing symbolic links and hard links here. What you're describing would be a hard link to a directory. Symlink links change how name resolution is done: A symlink link from /a/foo to /b/bar essential says: Look at /b/bar instead of foo in /a. After path resolution is done, /a/foo and /b/bar refer to exactly same canonical path. If foo would be a hard link (on an imaginary system that supported it), /a/foo and /b/bar would still be different canonical paths (that end in the same directory). Symbolic links do not change the directory tree! Hard links would change the tree, but they come with a bunch of problems, so no system really supports them.

Conceptually, there is a difference between canonical paths and directories. However, the assumption that there are no hard links to directories implies that directories and canonical paths to directories are in 1:1 correspondence. Note that this is not true for files, as we allow hard links to files.
linguofreak wrote:Unix System V apparently allowed hard links to directories, but only root could create them.
Okay, but if you actually did that, you certainly broke POSIX semantics. Hard links to directories break the assumption that the directory structure forms a tree. This implies that a) files no longer have canonical names b) directories can become detached from the file system root, even if their link count is greater than 1 c) as a result of that, unlink() has to perform a full DFS through the subdirectory tree(i.e. an unbounded number of I/O operations) to determine if a directory becomes detached.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 5:08 am
by mariuszp
Ah, so I have to store the path string (and canonicalize it to remove ".." and "." etc).

This is what I was doing the whole time, but the problem I was trying to fix was that right now the kernel stores mountpoints as path prefixes, so I always have to find the absolute path in order to figure out which filesystem the given path is on.

Now I want to store the current working directory, system root etc as a "directory handle", but it seems that as you say I must ALSO store the path. The mountpoint table will now have "parent device and inode" information instead of paths so I an just look at "struct stat" to figure out if a given entry is a mountpoint.

But since as you say a link could point at a directory... what should ".." point to? Do I have to, for each directory handle, store what path got me there too?

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 5:38 am
by Korona
Well, the natural way to implement it is as follows:

The current working directory is just a pointer to a VFS inode (i.e. the in-memory and not the on-disk representation of the inode). Mount points are identified by VFS directory entries. Let's denote directory entries as dentries instead as that is the terminology that Linux uses.

A dentry is essentially a pair of an inode pointer (the containing directory) and a string (the name of the file/directory/whatever). Each directory inode needs to store the dentry that it corresponds to. This inode-dentry chain forms the canonical path of each inode. For operations like getcwd() or ttyname() (or even readlink() on /proc/42/fd/21, if you want to support that) you can simply walk along this chain until you reach /.

Lets say that /x/y is a mount point. The dentry of /x/y is the pair (inode of x, "y"). The inode of x remembers that its dentry is (inode of /, "x"). How is /x/y/z resolved? The VFS starts with the inode of /. It asks whether there is a child named "x" in /. The underlying FS driver finds this child and returns the inode of x. Now the VFS recognizes that (inode of x, "y") is a mount point and continues the search on another device. Now, if the path had been relative, the only thing that would have changed is the inode we started with.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 8:49 am
by mariuszp
OK but what if /a is a symbolic link to /b/c.

Someone sets the working directory to /a. Now, the current directory is the VFS inode for "/b/c". So, the directory ".." would be "/b"; is this the correct behaviour? Or should ".." still point to "/", in which case I must remember the path that was used to reach this inode?

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 9:17 am
by iansjack
if the current working directory is /a then .. is /. Whether /a is a link to a deeper directory is irrelevant. By storing the cwd as /a you automatically "remember" the path that you used to get to it.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 10:15 am
by davidv1992
This post has gotten a bit larger than I had hoped, but I think it is important enough to post it anyway.

It is good to realize that there are two viewpoints as to how to handle .. people are presenting here:

Option 1 (POSIX):
One option is to take the view that .. always points to the same thing, the "canonical" parent of the directory. In this view, the directory structure is a pure tree, and when we follow a symlink, we forget that we ever did that.

This has the advantage that the current position is everything you need to determine what any name, including .., means. Furthermore, if you implement the chdir function, this view conforms to the POSIX standard, which might make porting easier in some cases.

The disadvantage is that this breaks the notion that /a/b/../c/d always equals /a/c/d, which can be counter-intuitive towards users. It is also part of the cause of the admittedly somewhat crazy rules for the shell cd command in the posix standard.

Option 2
The alternative is to take the view that to be in a directory always means to have specified some path from root to that directory. Accessing .. then translates to removing the last element of that path.

This has as an advantage that .. in a larger path will always behave as expected, so specifying /a/b/../c/d always means /a/c/d. Hence, one can always simplify path without worrying that this might actually change their meaning.

However, this scheme also has some disadvantages: First of all, everywhere where you access directories using some form of reference, that reference needs to either point or contain the path used to generate this. This would most likely require some amount of extra storage. Furthermore, you end up with some other strange behaviour: for example, two programs could have a handle to the same point in a directory tree, but which contain different origin paths. In that case, they would get different results when interrogating the properties of what the .. entry in that directory points to.

Further notes
Both options are valid ways of implementing the behaviour of symbolic links. Each has their advantages and disadvantages, and you will need to decide what is most important in your system.

For some example implementations, the linux kernel, or at least the glibc/linux chdir function, behaves according to option 1. In contrast, bash behaviour mostly seems to follow option 2, which it seems to achieve by keeping track of the current working directory internally as well.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 10:35 am
by Korona
mariuszp wrote:OK but what if /a is a symbolic link to /b/c.

Someone sets the working directory to /a. Now, the current directory is the VFS inode for "/b/c". So, the directory ".." would be "/b"; is this the correct behaviour? Or should ".." still point to "/", in which case I must remember the path that was used to reach this inode?
Yes, that is the POSIX behavior. The current directory should be /b and not /.
iansjack wrote:if the current working directory is /a then .. is /. Whether /a is a link to a deeper directory is irrelevant. By storing the cwd as /a you automatically "remember" the path that you used to get to it.
That does not match POSIX semantics. You and probably many others may confuse this because of the behavior of the "cd" shell builtin. Note that "cd" does not behave like chdir()! POSIX specifies how path resolution is done and chdir() is not special cased.

Try
mkdir -p y/z && ln -s y/z x && gcc main.c && ./a.out
where main.c is given by

Code: Select all

#include <stdio.h>
#include <unistd.h>

int main() {
	chdir("x");
	chdir("..");

	char s[1024];
	getcwd(s, 1024);
	printf("Current directory: %s\n", s);
}
This will print "y" on any POSIX compatible system.

EDIT: Also note that if you remove the second chdir(), getcwd() returns "y/z" and not "x".

If you implement it any other way, you'll break POSIX semantics (and all programs that depend on this behavior; I'm sure there are actually tons of programs that depend on the tree structure of canonical paths). The thread's title explicitly asks about POSIX semantics.

EDIT: As a side note: Yes, blindly dropping components like foo/.. from paths is a bug. Use realpath() to resolve paths.

Re: how do POSIX systems determine the current working direc

Posted: Wed Nov 01, 2017 5:03 pm
by Brendan
Hi,
davidv1992 wrote:This post has gotten a bit larger than I had hoped, but I think it is important enough to post it anyway.

It is good to realize that there are two viewpoints as to how to handle .. people are presenting here:

Option 1 (POSIX):
One option is to take the view that .. always points to the same thing, the "canonical" parent of the directory. In this view, the directory structure is a pure tree, and when we follow a symlink, we forget that we ever did that.

This has the advantage that the current position is everything you need to determine what any name, including .., means. Furthermore, if you implement the chdir function, this view conforms to the POSIX standard, which might make porting easier in some cases.

The disadvantage is that this breaks the notion that /a/b/../c/d always equals /a/c/d, which can be counter-intuitive towards users. It is also part of the cause of the admittedly somewhat crazy rules for the shell cd command in the posix standard.

Option 2
The alternative is to take the view that to be in a directory always means to have specified some path from root to that directory. Accessing .. then translates to removing the last element of that path.

This has as an advantage that .. in a larger path will always behave as expected, so specifying /a/b/../c/d always means /a/c/d. Hence, one can always simplify path without worrying that this might actually change their meaning.

However, this scheme also has some disadvantages: First of all, everywhere where you access directories using some form of reference, that reference needs to either point or contain the path used to generate this. This would most likely require some amount of extra storage. Furthermore, you end up with some other strange behaviour: for example, two programs could have a handle to the same point in a directory tree, but which contain different origin paths. In that case, they would get different results when interrogating the properties of what the .. entry in that directory points to.

Further notes
Both options are valid ways of implementing the behaviour of symbolic links. Each has their advantages and disadvantages, and you will need to decide what is most important in your system.

For some example implementations, the linux kernel, or at least the glibc/linux chdir function, behaves according to option 1. In contrast, bash behaviour mostly seems to follow option 2, which it seems to achieve by keeping track of the current working directory internally as well.
This is why I wrote "Note: POSIX may or may not be compatible with "should"." in my original post - I wasn't entirely sure, but suspected that POSIX is a hideous disaster that gets everything wrong (and that there's a huge amount of software that has bugs in the presence of symbolic links because POSIX does it wrong). ;)

For the way I do it (which is definitely not compatible with POSIX for many reasons); a current working directory is just a "prefix string" (with no associated inode) that's maintained by the process itself; and the kernel and VFS only ever sees canonical absolute paths. E.g. if a process does "chdir("/foo/../bar");" it does pure text manipulation to set the prefix string to "/bar/" itself (possibly without checking or caring if "/foo" or "/bar" exist if that's what the process wanted to do), then if the process opens the file "baz" it prepends the prefix string itself and asks VFS to open "/bar/baz". If the process doesn't want to use any current working directory at all, that's fine (e.g. it can use absolute paths for everything). If the process wants to give each thread its own independent current working directory, that's fine too. Of course this could all be delegated to a library (potentially including having a library that emulates POSIX brokenness).

For the VFS; it converts the absolute path to a hash, then tries to find the inode in a glorified hash table. If it's not found it chops the last path separator and whatever followed it off of the end of the string (e.g. the string "/foo/bar/baz" becomes "/foo/bar") and retries to find the parent directory's inode, and so on (until it finds something that is in its cache, and can start doing "VFS cache miss or bad path" handling). Keeping track of the current working directory's inode does not speed this up and doesn't improve performance at all. Instead; nothing outside the VFS is allowed to have a reference to the VFS's internal data (separation of concerns) and the VFS's ability to manage its caches (including "least recently used inode eviction") isn't complicated by lots of processes holding references to inodes that aren't associated with open file/directory descriptors.

The downside is that if/when something has multiple absolute paths (e.g. due to symbolic links) it may have multiple hash table entries and each of its paths (that VFS knows about) need to be stored in the inode to complete hash table lookup.

Note 1: In theory (I've never implemented it); the VFS's hash table can be distributed across multiple VFS threads where an absolute path is converted to "thread number and hash for that thread"; and where each VFS thread can manage its own private hash table and its own private caches. This could have benefits for cache locality and scalability (especially many CPU and NUMA systems), and in some cases would allow the VFS process to benefit from my OS's thread specific storage (e.g. on a 32-bit system with PAE; a VFS process with 128 threads can use up to 256 GiB of RAM for caches because each thread has 2 GIB of thread specific storage).

Note 2: If I ever implement something like "chroot()", it'd essentially be the same - just a simple string that's prefixed to a process' absolute paths (by VFS instead of by the process itself).


Cheers,

Brendan