
Re:what are some alternative path methods(like C:\ /linux)

Posted: Sat Jul 22, 2006 2:34 pm
by Candy
evincarofautumn wrote:
Also, will you allow it for directories? That causes a problem when you link a directory into itself, creating a loop, and an even bigger one when you subsequently delete the first entry of the loop. You'd be stuck with a directory loop that could never be freed, because it always points to itself and so always appears to be in use.
I realise that there could be a problem with circular references if a directory kept a reference to itself. The thing that would prevent such circular references from becoming a problem is that data blocks keep a list of references to the blocks that reference them.
The point was more about allowing directories to be hardlinked at all. Current Unix doesn't allow it, since hard-linked directories would seriously confuse some older tools and could cause real problems.

The one I was aiming at specifically was the expanded link-to-self problem: what if you have a chain of objects whose last link points back to the first? Then, if you unlink your own reference to the chain, it's lost forever. This is a problem also faced by designers of garbage-collecting memory allocators, since reference counting alone can't free a block that's part of a circular reference. Their common solution is to ignore the problem day to day and fall back on a last-resort GC that checks all links, freezing the program for some seconds.

If you do that on a 250GB disk, or worse, on a RAID system of 4TB, you could be freezing the computer for minutes to hours.
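The leak described above can be sketched in a few lines of Python (the [tt]Store[/tt] class and its method names are purely illustrative, not from any real filesystem): pure reference counting never frees a cycle once the outside link is gone, and the fallback is exactly the full mark-and-sweep pass that causes the freeze.

```python
# Toy reference-counted "directory" store, illustrating why a pure
# refcount scheme leaks cycles and needs a mark-and-sweep fallback.

class Store:
    def __init__(self):
        self.links = {}     # name -> set of names it references
        self.refcount = {}  # name -> number of incoming references

    def create(self, name):
        self.links[name] = set()
        self.refcount[name] = 0

    def link(self, src, dst):
        self.links[src].add(dst)
        self.refcount[dst] += 1

    def unlink(self, src, dst):
        self.links[src].remove(dst)
        self.refcount[dst] -= 1
        # Pure refcounting: only free when the count hits zero.
        if self.refcount[dst] == 0:
            for d in list(self.links[dst]):
                self.unlink(dst, d)
            del self.links[dst], self.refcount[dst]

    def mark_and_sweep(self, roots):
        # Last-resort GC: walk everything reachable from the roots,
        # then free whatever was never marked.
        marked, stack = set(), list(roots)
        while stack:
            n = stack.pop()
            if n not in marked:
                marked.add(n)
                stack.extend(self.links[n])
        for n in [n for n in self.links if n not in marked]:
            del self.links[n], self.refcount[n]

s = Store()
for name in ("root", "a", "b", "c"):
    s.create(name)
s.link("root", "a"); s.link("a", "b"); s.link("b", "c"); s.link("c", "a")
s.unlink("root", "a")        # a's refcount is still 1 (from c) -> leaked
print(sorted(s.links))       # the a->b->c->a cycle is still allocated
s.mark_and_sweep({"root"})
print(sorted(s.links))       # only "root" survives
```

The mark-and-sweep pass has to touch every live object, which is why scaling it to a full disk means minutes rather than seconds.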

Re:what are some alternative path methods(like C:\ /linux)

Posted: Sat Jul 22, 2006 8:05 pm
by evincarofautumn
I'm afraid I have to disagree. Let's say we have a few directories, [tt]aaa[/tt], [tt]bbb[/tt], and [tt]ccc[/tt]. The following references exist:

[tt]aaa[/tt] contains [tt]bbb[/tt].
[tt]bbb[/tt] contains [tt]ccc[/tt].
[tt]ccc[/tt] contains [tt]aaa[/tt].

If there were no other references to these directories, i.e. the loop was closed and pure, there would indeed be no way to escape it. Let's try to create such a structure, though:

[tt]~# md aaa[/tt]
[tt]~# cd aaa[/tt]
[tt]~/aaa# md bbb[/tt]
[tt]~/aaa# cd bbb[/tt]
[tt]~/aaa/bbb# md ccc[/tt]
[tt]~/aaa/bbb# cd ccc[/tt]
[tt]~/aaa/bbb/ccc# mv ~/aaa ~/aaa/bbb/ccc
error: unable to move ~/aaa: cyclic dependency
hint: clone ~/aaa or ~/aaa/bbb/ccc to another directory[/tt]

By checking the references in the chain of parents from [tt]ccc[/tt] to [tt]aaa[/tt], we can infer whether or not [tt]aaa[/tt] can be safely moved. In this case, that requires only two dereferences ([tt]ccc[/tt] to [tt]bbb[/tt] and [tt]bbb[/tt] to [tt]aaa[/tt]), which isn't a lot.
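The check described above can be sketched as follows (a minimal sketch with hypothetical names; a real filesystem would walk on-disk parent references rather than an in-memory map):

```python
# Sketch of the pre-move cycle check: before moving `src` into `dest`,
# walk the chain of parents upward from `dest`; if `src` appears,
# the move would close a loop and must be refused.

def would_create_cycle(parents, src, dest):
    """parents maps a directory to the set of directories containing it."""
    seen, stack = set(), [dest]
    while stack:
        d = stack.pop()
        if d == src:
            return True            # src is an ancestor of dest
        if d not in seen:
            seen.add(d)
            stack.extend(parents.get(d, ()))
    return False

# The aaa/bbb/ccc example from the transcript above:
parents = {"bbb": {"aaa"}, "ccc": {"bbb"}}
print(would_create_cycle(parents, "aaa", "ccc"))  # True: found in two steps
print(would_create_cycle(parents, "ddd", "ccc"))  # False: ddd is unrelated
```

The cost is one dereference per parent link visited, matching the two dereferences counted above for the single-parent case.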

If we require that the [tt]mv[/tt] command's second argument be rooted at [tt]~[/tt], and every directory in the upward chain has only one parent, then a single walk up that chain is guaranteed to find any cyclic dependency.
Candy wrote: If you do that on a 250GB disk, or worse, on a RAID system of 4TB, you could be freezing the computer for minutes to hours.
In an extreme case involving, say, five directories with two parents each, the number of dereferences is still limited: at most one seek and one read per reference, which resolves to maybe a few seconds on a 5400rpm drive, depending, of course, on where the files sit on disk. Not a big deal. By caching and sorting the writes (and the reads, where possible), you could probably cut that time significantly.

As I mentioned,
...it is likely that the file is not in more than a few directories...
so the problem of read/write times is minimal. You could even disable the feature for ordinary users entirely and allow only system functions to use it, or, as on UNIX, limit hard links to files only. An alternative would be to limit the number of parent directories for any given directory to an arbitrary low, convenient number (say, 16) and likewise limit a directory's distance from the root (another 16). This would make things like [tt]~/this/is/a/ridiculously/long/path/with/a/huge/amount/of/subdirectories/isnt/it/yes/I/think/it/is[/tt] illegal, as well they should be.
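Those two limits could be enforced with checks as simple as these (the constants and function names are hypothetical; any real filesystem would pick its own):

```python
# Sketch of the two limits suggested above: cap how many parent
# directories any directory may have, and cap path depth from the root.

MAX_PARENTS = 16
MAX_DEPTH = 16

def can_add_parent(parent_count):
    # Refuse a new hard link once a directory already has MAX_PARENTS parents.
    return parent_count < MAX_PARENTS

def path_depth_ok(path):
    # Count components below the root; "~/a/b" has depth 2.
    return len([p for p in path.split("/") if p and p != "~"]) <= MAX_DEPTH

print(path_depth_ok("~/this/is/a/short/path"))   # True: 5 levels deep
long_path = "~/" + "/".join(f"d{i}" for i in range(17))
print(path_depth_ok(long_path))                  # False: 17 levels deep
```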

So maybe something as complex as this shouldn't go into my filesystem, at least not in its first version. My reason for coming up with things like this is that I've only worked (in code) with FAT. I find it dissatisfying, unintuitive, and lacking in certain features that spice up such filesystems as ext3.

I'm very sorry for this series of very long posts. ::) Please point out any of my mistakes if you see them!

Re:what are some alternative path methods(like C:\ /linux)

Posted: Sun Jul 23, 2006 2:40 am
by Candy
evincarofautumn wrote: I'm very sorry for this series of very long posts. ::) Please point out any of my mistakes if you see them!
Don't be sorry for long posts, especially when they contain a lot of good insight. I was quite wrong: I thought you'd have to check the entire child tree of your directory, and it hadn't occurred to me that walking the parent list is equally correct.

Re:what are some alternative path methods(like C:\ /linux)

Posted: Mon Jul 24, 2006 8:43 am
by evincarofautumn
Don't be sorry for long posts...
I meant that I felt like I was getting off-topic and rambling a bit.
...especially when they contain a lot of good insight.
aww. Thanks. ;D

As for alternative pathing methods... hmm... off the top of my head, I'd choose a system based on URLs: directories are separated with [tt].[/tt] and read from top to bottom, right to left. File extensions are noted with [tt]_[/tt], and operations on standard filetypes are performed with [tt]:[/tt], e.g.

[tt]open:txt_my text document.my documents.home.local hd[/tt]

It's a bit backward, but you asked for "alternative". ;D
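A toy parser for this scheme might look like the following (purely illustrative; the grammar is only what the example above implies):

```python
# Toy parser for the dot-separated, right-to-left scheme sketched above.
# Assumed grammar: operation ":" extension "_" name "." dir "." ... "." device

def parse(path):
    op, rest = path.split(":", 1)
    ext, rest = rest.split("_", 1)
    parts = rest.split(".")
    return {
        "operation": op,
        "extension": ext,
        "file": parts[0],
        # Directories read right to left: the rightmost is the device.
        "directories": parts[1:][::-1],
    }

print(parse("open:txt_my text document.my documents.home.local hd"))
# {'operation': 'open', 'extension': 'txt', 'file': 'my text document',
#  'directories': ['local hd', 'home', 'my documents']}
```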

Alternatively (;)) only block devices are indicated with [tt].[/tt], while paths are separated with [tt]/[/tt] as in a standard URL. The [tt]:[/tt] for operations is replaced with [tt]://[/tt], and [tt].[/tt] takes the place of [tt]_[/tt], e.g.:

[tt]open://hd.local.disk/home/my documents/my text document.txt[/tt]

Which is not half bad. ;)
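This second scheme parses even more easily; a minimal sketch (again with illustrative names, assuming only what the example shows):

```python
# Toy parser for the URL-style scheme: operation "://" device "/" path,
# where only the device name uses "." as a separator.

def parse_url(path):
    op, rest = path.split("://", 1)
    device, _, tail = rest.partition("/")
    *dirs, filename = tail.split("/")
    return {
        "operation": op,
        "device": device.split("."),
        "directories": dirs,
        "file": filename,
    }

print(parse_url("open://hd.local.disk/home/my documents/my text document.txt"))
# {'operation': 'open', 'device': ['hd', 'local', 'disk'],
#  'directories': ['home', 'my documents'], 'file': 'my text document.txt'}
```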