Re:Universal File System

Posted: Fri Jan 13, 2006 2:57 am
by Kemp
Add one to the value of the first character of the short filename. As it no longer directly corresponds to the long filename you can say it's a file identifier not a file name. And as it isn't compatible with the MS implementation (or spec) it shouldn't be covered by the patent anymore. You're not storing a short filename and a long filename, you're storing a long filename and an alphanumeric file identifier, you just happen to store them in the same structures as in the FAT system (which you say MS hasn't patented).

Re:Universal File System

Posted: Fri Jan 13, 2006 3:21 am
by Candy
Something for our FS:

Seeing as flash-based disks are on the rise, what if we include a few flash-specific bits?

One I'm thinking of is wear reduction: keep two pointers to where we last wrote (one for data, one for inodes) and start the next allocation from there. That way, writing a new one is slightly faster in the average case (equivalent to next fit vs. first fit) and wear is spread more or less evenly over the entire disk. The first part of the disk might still wear a lot faster, but at least it won't wear like the start of a FAT disk does.
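A minimal sketch of the "remember where we last wrote" idea, assuming a simple in-memory free map. The class and method names are made up for illustration and are not part of any agreed design:

```python
class NextFitAllocator:
    """Next-fit allocator: each search resumes where the previous one ended."""

    def __init__(self, num_blocks):
        self.free = [True] * num_blocks  # True means the block is free
        self.last = 0                    # roving pointer (hypothetical name)

    def allocate(self):
        """Return the index of the next free block, or None if the region is full."""
        n = len(self.free)
        for step in range(n):
            i = (self.last + step) % n   # wrap around at the end of the region
            if self.free[i]:
                self.free[i] = False
                self.last = (i + 1) % n  # the next search starts after this block
                return i
        return None

    def release(self, i):
        """Mark block i as free again; the roving pointer is left untouched."""
        self.free[i] = True
```

Two instances, one for the data area and one for the inode area, would give the two pointers described above. Because a freed block is not reused until the pointer comes round again, writes rotate through the whole region instead of hammering its start.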

Also, I'm in favour of a fixed number of inode locations. That reduces the complexity of writing them considerably: you can always create a file while a slot is available, and you cannot create one when the table is full. There is no weird situation where a 4 GB file sits right after the last inode, can't be moved, and so stops you from creating any more files.

Re:Universal File System

Posted: Fri Jan 13, 2006 4:03 am
by kataklinger
Candy wrote: For the rest, FAT is OK. Assuming there aren't any more patents, which I doubt. For me, screw Microsoft, I'm still going to implement it (not a US resident, no aspiration of ever going there). For this project, I'm most certainly going to advocate not using FAT or NTFS at all.
They have probably already patented FAT all over the world, so you should check whether it is patented in your country.

Re:Universal File System

Posted: Fri Jan 13, 2006 4:14 am
by Candy
kataklinger wrote: They probably have already patented FAT all over the world. So you must see if it is patented in your country.
I'm sure about the European software patent policy (patent it, but make it impossible to enforce until it is decided that such patents will be enforced, in which case all of Europe is screwed anyway) and the Dutch one (identical to the European one). Those two apply to me; none of the rest do (unless I explicitly export my software, which I'm not actively doing or going to do).

I think Germany has a somewhat different view on trademarks (I heard some rumor about "public domain" not existing in their view of copyrights) but I can't vouch for any other country.

Re:Universal File System

Posted: Fri Jan 13, 2006 4:16 am
by Pype.Clicker
Candy wrote: Something for our FS:

Seeing as flash-based disks are on the rise, what if we include a few flash-specific bits?
I should dig up some information about it, but as far as I know it's the job of the flash disk controller to present a "consistent" disk (i.e. one where reads and writes can be done at random places, including writing the same sectors over and over (the FAT?)) on top of what the hardware actually requires. I don't think that knowledge should be brought into the FS design itself, much like you don't want to bring the actual geometry of a regular disk into the design of the FS...

Re:Universal File System

Posted: Fri Jan 13, 2006 4:57 am
by bubach
- file names must be 231 characters or less (including any parent directory names)
I like the way Microsoft handles this (even in NTFS!). You can make files with long names (up to 255 chars), yes. You can make folders with long names (255 chars), yes. But you cannot have a file with a 255-char name inside a folder with a 255-char name. The limit is 255 for the whole path, so the max filename would be about 125 chars inside a folder with a name of 125 chars... Clever Microsoft... :D

Anyway, I would gladly implement this FS in my OS. I like the "KISS" sound of it.

Re:Universal File System

Posted: Fri Jan 13, 2006 5:22 am
by Rob
That's not fully correct, bubach. See stdlib.h in the MS SDK:

#define _MAX_PATH 260 /* max. length of full pathname */
#define _MAX_DRIVE 3 /* max. length of drive component */
#define _MAX_DIR 256 /* max. length of path component */
#define _MAX_FNAME 256 /* max. length of file name component */
#define _MAX_EXT 256 /* max. length of extension component */

In other words the total length can be 260 chars long (probably
259, I assume the terminator is included in that length as well).

I'd say that is pretty long already. I don't mind it being "infinite",
but check out this one for example:

D:\This folder name is a total of exactly 78 characters long, that's pretty long!\This folder name is a total of exactly 78 characters long, that's pretty long!\This folder name is a total of exactly 78 characters long, that's pretty long!

That thing is 239 characters long. After that WinXP started to
complain it couldn't create a new folder.
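As a rough check of the arithmetic, here is a small sketch that rebuilds the example path and compares it with the `_MAX_PATH` value quoted from stdlib.h. The helper name `remaining_budget` is made up, and treating 260 as including the terminator is the same assumption made above:

```python
MAX_PATH = 260  # from the stdlib.h excerpt; assumed to include the terminator

folder = "This folder name is a total of exactly 78 characters long, that's pretty long!"
path = "D:\\" + "\\".join([folder] * 3)  # drive prefix plus three nested folders

def remaining_budget(path, max_path=MAX_PATH):
    """Characters still available once one slot is reserved for the terminator."""
    return max_path - 1 - len(path)

print(len(folder), len(path), remaining_budget(path))
```

That is 3 characters for `D:\`, 3 x 78 for the folder names and 2 separators, so 239 in total, leaving 20 characters that would have to cover another separator plus the new folder's name. As far as I know, the Windows documentation for `CreateDirectory` gives a lower default limit of 248 characters for directory paths, which would be consistent with XP refusing at this point, but the thread itself does not say which limit was hit.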

Re:Universal File System

Posted: Fri Jan 13, 2006 5:36 am
by Candy
It's pointless to set any limit. If you can make it limitless without overhead that's good enough. We can.

For practical purposes, I don't expect names longer than 80 characters to be used. For normal use as a unit of file transfer, I expect filenames to stay under 30 bytes for computer professionals and under 80 for normal users. That means that with a 128-byte "inode" we can serve all of them.

For pathological cases, I still like being able to extend them. If/when I need to make a long path (copy a full directory to it that happens to go quite deep for instance) I would like it to work.
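To put rough numbers on that, here is a sketch of how many 128-byte entries a name would need if long names spill into continuation entries. Only the 128-byte entry size and the 80-byte figure come from the post; the split into 48 bytes of metadata plus 80 bytes of name, and continuation entries that hold nothing but name bytes, are assumptions made for illustration:

```python
INODE_SIZE = 128            # entry size from the post
NAME_IN_FIRST = 80          # assumption: 48 bytes of metadata leave 80 for the name
NAME_IN_CONT = INODE_SIZE   # assumption: continuation entries hold name bytes only

def entries_needed(name):
    """Number of 128-byte entries needed to store `name` (UTF-8 encoded)."""
    length = len(name.encode("utf-8"))
    if length <= NAME_IN_FIRST:
        return 1
    extra = length - NAME_IN_FIRST
    return 1 + -(-extra // NAME_IN_CONT)  # ceiling division

print(entries_needed("readme.txt"), entries_needed("x" * 300))
```

Under these assumptions every name of 80 bytes or less costs exactly one entry, and the pathological cases only pay for the extra entries they actually use.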

Re:Universal File System

Posted: Fri Jan 13, 2006 6:01 am
by dushara
Just an idea to run through...

I can't say I properly thought this through, but the idea is roughly this:

Everything is in blocks with a little bit of overhead (Block size I haven't thought of). The header of the block is something like this:

long: file no
long: file's block no
word: block version
word: no of bytes written in block
The rest of the block contains data (maybe a CRC field is also needed).

file no is a unique ID for each file in the system.
file's block no gives the block's position within the file, which allows the file to be fragmented. The block version is used when data in the middle of the file needs to be changed: the changed block is written elsewhere with a higher version, which essentially makes the older-version blocks count as free space.

Naturally the OS needs to scan the disk and build the directory structure etc. in memory.
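A rough sketch of that header and the mount-time scan, assuming "long" means 32 bits, "word" means 16 bits, little-endian layout, a 64-byte block and file no 0 marking a never-written block. None of those choices are fixed by the idea above, and the function names are made up:

```python
import struct

HEADER = struct.Struct("<IIHH")  # file no, file's block no, version, bytes used
BLOCK_SIZE = 64                  # assumption: tiny blocks to keep the example short

def make_block(file_no, block_no, version, data):
    """Build one on-disk block: header, then data, zero-padded to BLOCK_SIZE."""
    header = HEADER.pack(file_no, block_no, version, len(data))
    return (header + data).ljust(BLOCK_SIZE, b"\0")

def scan(disk):
    """Scan every block; return {file_no: {block_no: (version, data)}}."""
    files = {}
    for offset in range(0, len(disk), BLOCK_SIZE):
        block = disk[offset:offset + BLOCK_SIZE]
        file_no, block_no, version, used = HEADER.unpack_from(block)
        if file_no == 0:  # assumption: file no 0 means the block was never written
            continue
        blocks = files.setdefault(file_no, {})
        # Keep only the newest version; this plain comparison ignores wrap-around.
        if block_no not in blocks or version > blocks[block_no][0]:
            blocks[block_no] = (version, block[HEADER.size:HEADER.size + used])
    return files

def read_file(files, file_no):
    """Concatenate a file's blocks in block-number order."""
    blocks = files[file_no]
    return b"".join(blocks[n][1] for n in sorted(blocks))
```

The sketch makes the cost visible: nothing can be read until the whole disk has been scanned, and the result has to be kept in memory.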

File names (subdirectories are empty files) will occupy one block and point to the directory block. This could be a little inefficient, maybe.

On flash memory, defragmenting is necessary (as far as I know, you can only turn a 1 into a 0, and you can only erase whole sectors at a time).

Anyway, these are rough ideas, is there anything useful here?

Re:Universal File System

Posted: Fri Jan 13, 2006 6:09 am
by Candy
dushara wrote: I can't say I properly thought this through, but the idea is roughly this:

Everything is in blocks with a little bit of overhead (Block size I haven't thought of). The header of the block is something like this:

long: file no
long: file's block no
word: block version
word: no of bytes written in block
The rest of the block contains data (maybe a CRC field is also needed).

file no is a unique ID for each file in the system.
file's block no allows the file to be fragmented. The block version is used when data in the middle of the file needs to be changed. This essentially makes older-version blocks count as free space.
You have no management of file numbers, and you need that to avoid duplicates. You could do that in code, but it would incur additional cost.
All of a file's block numbers need to be read before you can tell which part of the file is where. That's very slow and memory-hogging. Also, you have a wrap-around condition that doesn't work.
File names (subdirectories are empty files) will occupy one block and point to the directory block. This could be a little inefficient, maybe.

On flash memory, defragmenting is necessary (as far as I know, you can only turn a 1 into a 0, and you can only erase whole sectors at a time).

Anyway, these are rough ideas, is there anything useful here?
Defragmenting is necessary in any case if you don't support fragmented files. If you do, it's still occasionally useful, but not required. We would have to defragment to consolidate the free space, but you can do that quite quickly and easily in a program. We might have to do some worst-case analysis to determine how complex to make the defrag program, but it should be manageable in any case.

The block wasted on empty files is OK, I think. Empty files are rarely used, for obvious reasons (they take up space and can't contain data).


---- new message, but since I keep posting here...

Microsoft has a driver development toolkit for file systems, but it's US$109 + S&H (between US$15 - US$25). I don't have that money right now, but I might be able to save up for it. However, since we're doing this as a community, why not save up as a group and point out one person who will be creating the Windows driver?

I'd contribute 10 euro, so that's like US$11, for starters. I'd prefer if somebody else made the windows driver, so it remains a group thing.

The link: Microsoft IFS kit

-------another

We could just base it on ext2ifs and forget the money.

Re:Universal File System

Posted: Fri Jan 13, 2006 7:25 am
by kataklinger
I think that the MS DDK is free, but costs some money if you want the FAT & ISO9660 drivers. A friend of mine is working on FS extensions (file hiding & encryption...), I'll ask him for more information on that.

Re:Universal File System

Posted: Fri Jan 13, 2006 8:16 am
by bubach
Well, I don't care about the exact numbers; my point is that it's totally useless to have filenames that long if the path can't be more than 360 chars anyway.
I can't think of any good reason why anyone would need more than maybe 20-40 chars for a filename.

Re:Universal File System

Posted: Fri Jan 13, 2006 9:27 am
by JoeKayzA
bubach wrote: Well, I don't care about the exact numbers; my point is that it's totally useless to have filenames that long if the path can't be more than 360 chars anyway.
I can't think of any good reason why anyone would need more than maybe 20-40 chars for a filename.
You are right in this case, if the path can't be more than 360 chars. But in the design we've made up here (until now, at least), the file path is stored in the file's name field, so a limit on the filename length has a direct impact on the length of the path and therefore on the maximum directory depth. (Or did I just get your point wrong?)
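To illustrate what "the path is stored in the name field" means for lookups, here is a sketch that lists a directory by prefix-matching full path names. The `/` separator and the function name are assumptions, not something the design has settled on:

```python
def list_directory(names, directory):
    """Return the immediate children of `directory`, given full path names."""
    prefix = directory.rstrip("/") + "/"
    children = set()
    for name in names:
        if name.startswith(prefix):
            rest = name[len(prefix):]
            if rest:
                # Keep only the first component after the prefix.
                children.add(rest.split("/", 1)[0])
    return sorted(children)

names = ["docs/readme.txt", "docs/specs/ufs.txt", "docs/specs/fat.txt", "music/song.mp3"]
print(list_directory(names, "docs"))
```

Every extra directory level lengthens the stored name of every file below it, which is why a cap on the name field is also a cap on directory depth.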

In general I second Candy's opinion on scalability: If it is possible, I wouldn't make up any limits other than the available disk space. Not that I would really need a path with 2k characters, but mind that it would save space in the first place for shorter filenames, while still providing the ability to handle longer ones when really needed...

cheers Joe

Re:Universal File System

Posted: Fri Jan 13, 2006 9:28 am
by Colonel Kernel
bubach wrote: I can't think of any good reason why anyone would need more than maybe 20-40 chars for a filename.
Have you ever ripped songs from a soundtrack CD to mp3s? For example:

Christopher Franke - Babylon 5- Interludes and Examinations - 03 - Londo Meets With Morden-Changing Plans-Reports Of Shadow Attack-In Need Of A Victory-Garibaldi and Hobbs.mp3
That's without the directory name, which is already pretty deep.

Re:Universal File System

Posted: Fri Jan 13, 2006 9:47 am
by Pype.Clicker
Colonel Kernel wrote:
bubach wrote: I can't think of any good reason why anyone would need more than maybe 20-40 chars for a filename.
Have you ever ripped songs from a soundtrack CD to mp3s? For example:
Yes, I have. And fortunately, there are ID3 tags, which virtually any program will prefer to display over the filename. The same kind of thing happens with pictures (your digicam will just call it 00105613.jpg, but it will have album name, shooting date, etc. in comments)...

Even for Word documents, spreadsheets and PDF files, the "filename" is less and less meaningful (how many copies of the Intel manuals do you have under different names on your system?).

Personally, I'd advocate "filenames" that are long enough to avoid clashes and to allow any decent hand-written name (you know, those readme.txt and the like), plus a defined mechanism to export properties that UFS cannot handle (such as an XML file), which may be handy when you go e.g. E2FS -> UFS -> E2FS.
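One possible shape for such an export, sketched with Python's standard `xml.etree.ElementTree`. The element and attribute names (`file`, `property`, `name`) and the example properties are invented for illustration; no format has been defined in this thread:

```python
import xml.etree.ElementTree as ET

def export_properties(filename, properties):
    """Serialise properties the target FS cannot store into a small XML document."""
    root = ET.Element("file", name=filename)
    for key, value in sorted(properties.items()):
        ET.SubElement(root, "property", name=key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def import_properties(xml_text):
    """Read the properties back, e.g. when copying to a richer file system again."""
    root = ET.fromstring(xml_text)
    props = {p.get("name"): p.text for p in root.findall("property")}
    return root.get("name"), props

xml_text = export_properties("notes.txt", {"owner": "joe", "mode": "0644"})
print(xml_text)
```

A round trip like E2FS -> UFS -> E2FS would then write the XML next to the file on the way in and read it back on the way out, so attributes UFS has no field for are not lost.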