OSDev.org

Posted: **Thu May 01, 2003 11:52 am**

Here are the features I require of the FS for my OS. At the moment I don't know of any FS that has all of these features (not BFS, not ReiserFS, etc.), so this may become a secondary project down the road when I need it.

1. File meta-data. All file components (filename, time/date, file contents, etc.) are just meta-data components, and any other type can be added.

2. Case-insensitivity + case-preservation. This is the Windows/Mac OS X method. Filename case is preserved, but references to 'MyFile' and 'myfile' would be the same file.

3. Live queries. The big feature of BFS. Allows you to create pseudo-directories which contain the contents of a query rather than actual directory contents. Create a query for all MP3 files, and you now have a single directory containing all MP3 files in the computer, regardless of where they are. The queries can be made on filenames or any other meta-data.

4. Speed. This includes FS structure overhead, live query processing, and raw data throughput. One of the big reasons I won't just use BFS is that it doesn't have the performance. ReiserFS is looking better and better in this department, among others.

5. Journaling. Critical for disaster recovery, to make sure your data is always in a stable state.

6. Space efficiency + expandability. The FS must be able to handle extremely large drives, and extremely large file sizes, without wasting space (like by increasing the cluster size).
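Points 1 to 3 can be sketched together in a few lines. This is only a toy in-memory model, not how BFS or any real FS does it: `MetaStore`, `FileRecord` and the attribute names are made up for illustration, the namespace is flat, and the query is a plain linear scan where a real implementation would keep indexes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FileRecord:
    # Every component, including the name, is just another attribute.
    attrs: Dict[str, Any] = field(default_factory=dict)


class MetaStore:
    def __init__(self) -> None:
        # Keyed by the case-folded name; the record keeps the original spelling.
        self._by_name: Dict[str, FileRecord] = {}

    def create(self, name: str, **attrs: Any) -> FileRecord:
        rec = FileRecord({"name": name, **attrs})
        self._by_name[name.casefold()] = rec
        return rec

    def lookup(self, name: str) -> FileRecord:
        # 'MyFile' and 'myfile' fold to the same key.
        return self._by_name[name.casefold()]

    def query(self, predicate: Callable[[FileRecord], bool]) -> List[str]:
        # Re-evaluated on every call, so the result follows the current files.
        return sorted(r.attrs["name"] for r in self._by_name.values() if predicate(r))


store = MetaStore()
store.create("MyFile", type="text/plain")
store.create("Song.MP3", type="audio/mpeg", rating=5)


def is_mp3(rec: FileRecord) -> bool:
    return rec.attrs.get("type") == "audio/mpeg"


all_mp3s = store.query(is_mp3)
```

Listing the pseudo-directory is then just calling `store.query(is_mp3)` again; a file created later shows up without the query being redefined.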

I think that'd do it for me.

Posted: **Thu May 01, 2003 12:33 pm**

NTFS has all of these features, although most features of indices (a.k.a. 'directories') aren't exposed to user mode.

Posted: **Wed Jun 04, 2003 3:23 pm**

smurf975 wrote: Are you using it as multi user database server or are you doing a simple desktop OS?

This would be for a desktop OS (maybe more "workstation" than pure "desktop") ...

Maybe I'm not seeing things correctly here, but I really feel the usual storage/retrieval techniques don't work anymore on large (40GB and more) disks. I've reached a point where, when I'm looking for a document which should be somewhere on my disk, I just open "google" and download it for the Nth time ...

Posted: **Fri Jun 06, 2003 4:36 am**

Hi

Creating a filesystem that contains lots of meta data sounds good in theory, but what happens if you zip up the files? What happens if you copy them onto a CD? Where does all the meta-data go?

With Unix and Windows only having the basic directory / hierarchy system to organise files, it's probably best to stick with the status quo.

Apparently in Longhorn Micro$oft are keeping NTFS. Their all new database style filesystem has been demoted to a process that runs on top of NTFS.

If M$ can't produce a filesystem based on meta-data and db searches (at least not before 2008 or whenever the next big release is), I don't think it's worth us amateurs trying it.

Posted: **Fri Jun 06, 2003 5:50 am**

What will occur if you pack the files ?

The metadata for your files will be collected (at least, the items you want to export: you could for instance teach your system that it should not export your "where does the file come from" metadata, while it could export the "rating" of your MP3s...) and stored alongside them in a file added to the archive.

The most obvious thing I see here is to put it in an XML document (for instance one per directory, with an entry per file), so that other systems could easily import it.

Storing metadata as a "metadata.xml" file in a ZIP archive poses no problem, as it will be packed anyway. Moreover, in the case of a "search within archives too" operation, you just have to unpack one stream from your ZIPs...

The same kind of approach can be used when you export files to a CD, another filesystem or whatever. And when you import files from another file system/CD/net, etc. we'll look for a similar XML.
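A rough sketch of that export path, using only Python's standard `zipfile` and `xml.etree.ElementTree` modules. The XML layout (`<metadata>`, `<file>`, `<attr>`), the `EXPORTABLE` policy set and the function names are all assumptions for the sake of the example; nothing here is a fixed format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical export policy: only these keys leave the system.
EXPORTABLE = {"rating", "artist"}


def build_metadata_xml(files):
    """One <file> entry per file, one <attr> per exportable key."""
    root = ET.Element("metadata")
    for name, attrs in files.items():
        entry = ET.SubElement(root, "file", name=name)
        for key, value in sorted(attrs.items()):
            if key in EXPORTABLE:
                ET.SubElement(entry, "attr", key=key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def pack(files, contents):
    """Write the file contents plus a metadata.xml stream into a ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in contents.items():
            zf.writestr(name, data)
        zf.writestr("metadata.xml", build_metadata_xml(files))
    return buf.getvalue()


def read_metadata(archive_bytes):
    """Unpack only the metadata stream, e.g. for a search inside archives."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("metadata.xml"))
    return {
        f.get("name"): {a.get("key"): a.text for a in f.findall("attr")}
        for f in root.findall("file")
    }


files = {"song.mp3": {"rating": 5, "source": "downloaded"}}
archive = pack(files, {"song.mp3": b"\x00\x01"})
imported = read_metadata(archive)
```

Note that values come back as strings after the round trip; an importer would have to map them back to native types, which is part of the argument for not using XML locally.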

The reason why I don't keep that XML for local storage of metadata is efficiency. Once the data have been imported, there's no more need for portability, so no more need for XML.

You can hardcode sizes as native-endian two's-complement binary fields, replace some strings with UIDs, etc. and work at full speed, keeping metadata where it is likely to be needed, and optionally building indexes of it for faster queries.
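To make the contrast with XML concrete, here is a small sketch using Python's `struct` module. The record layout (a 64-bit signed size followed by a 16-bit UID) and the `STRING_UIDS` table are invented for the example.

```python
import struct

# Hypothetical table replacing well-known strings with small integer UIDs.
STRING_UIDS = {"audio/mpeg": 1, "text/plain": 2}
UID_STRINGS = {uid: s for s, uid in STRING_UIDS.items()}

# "=" means native byte order with standard sizes and no padding:
# q = 64-bit two's-complement size, H = 16-bit UID.
RECORD = struct.Struct("=qH")


def encode(size, mime):
    return RECORD.pack(size, STRING_UIDS[mime])


def decode(raw):
    size, uid = RECORD.unpack(raw)
    return size, UID_STRINGS[uid]


raw = encode(123456, "audio/mpeg")
```

The whole record is `RECORD.size` bytes (10 here), against several dozen bytes for the same information spelled out as XML text.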

If M$ can't produce a filesystem based on meta-data and db searches (at least not before 2008 or whenever the next big release is), I don't think it's worth us amateurs trying it.

I think the contrary! M$ has to SELL its product. If they make the wrong choice with their filesystem, users will not buy, or M$ will have to ship updates for free, do more marketing to reinforce their image, etc., thus it's a $M risk for M$.
If we amateurs give it a try, what does it cost? Is it more risky to implement MyUtopistFileSystem rather than NTFS? Virtually nothing but a disk to crash (surely you don't use your 2GB disk anymore, do you?).

Posted: **Wed Jun 25, 2003 5:37 am**

I've gone through the white papers for ReiserFS ... that's quite impressive stuff. About 90% of the features I'd like to offer can be simulated efficiently with it, just because they take special care in optimizing very small files... Just one thing that I regret about the "many many small files" approach (especially if you want to structure a large file as a collection of smaller files) is that it requires a lot of "open" system calls, and thus accessing a multi-field object whose fields are separate files could eat many CPU cycles just context-switching :-/

Obviously, not forcing "one block belongs to a single file" (but rather grouping several small files on a few blocks) gives very good results. Another "trick" they use is to treat the tail of a large file as a small file too.

@df:
what the hell are those "extents" you talk about? Filename extensions? Directory extensions? Inode extensions? Please tell me.

Posted: **Wed Jun 25, 2003 11:17 am**

NTFS provides the ability to put small data streams (= files) inside that file's FILE record (=inode). Each FILE record is pretty big (1024 bytes), so there's often quite a bit of space free inside it.

Posted: **Thu Jun 26, 2003 4:07 am**

Pype,

That's a really good idea, exporting the metadata into an XML file for "legacy" filesystems.

I might go design a filesystem based on metadata instead of a traditional directory hierarchy. When you have NO support for directories at all, it can simplify the design of the file system.

Posted: **Thu Jun 26, 2003 5:45 am**

Tim Robinson wrote: NTFS provides the ability to put small data streams (= files) inside that file's FILE record (=inode). Each FILE record is pretty big (1024 bytes), so there's often quite a bit of space free inside it.

Indeed. But if you take a close look, accessing a 10-byte file (for instance your .forward) will require reading up to 1024 bytes from the disk. It's not only a waste of space, but also a waste of precious I/O time for so little data, and it fits the cache badly.

With their approach of a "database of tiny files", where filenames lead to a key which in turn leads to a variable-sized record, you can have dozens of tiny files in a single block. Of course you're still reading a whole block when accessing one, but as there are many other files in it, the caching efficiency will be greatly improved.

This means that in order to have efficient access to small objects, Microsoft has to make extra application programming efforts with things like the registry, persistent storage, etc., which collect a lot of small objects in a few huge files, resulting in more complex apps that hardly offer the same performance.
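A toy version of that idea, to show why one block read can serve many tiny files. This is not ReiserFS's actual on-disk format: `PackedBlock`, the 4096-byte block size and the in-memory key index are assumptions, and the space the index itself would take on disk is ignored.

```python
BLOCK_SIZE = 4096  # assumed block size


class PackedBlock:
    """Several variable-sized records sharing one block."""

    def __init__(self):
        self.data = bytearray()
        self.index = {}  # key -> (offset, length) inside the block

    def add(self, key, payload):
        # Refuse the record if it no longer fits in this block.
        if len(self.data) + len(payload) > BLOCK_SIZE:
            return False
        self.index[key] = (len(self.data), len(payload))
        self.data += payload
        return True

    def read(self, key):
        offset, length = self.index[key]
        return bytes(self.data[offset:offset + length])


block = PackedBlock()
for i in range(50):
    block.add(f"file{i}", b"0123456789")
```

Fifty 10-byte files occupy 500 bytes of a single block here, so once that block is cached, the other 49 reads cost no extra I/O.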

Posted: **Thu Jun 26, 2003 7:54 am**

Pype.Clicker wrote: Indeed. But if you take a close look, accessing a 10-byte file (for instance your .forward) will require reading up to 1024 bytes from the disk. It's not only a waste of space, but also a waste of precious I/O time for so little data, and it fits the cache badly.

The file system read those 1024 bytes anyway, when it was opening the file. If the file contents weren't inside the FILE record, it would still read the FILE record anyway, to find out where the real data were.

This means that in order to have efficient access to small objects, Microsoft has to make extra application programming efforts with things like the registry, persistent storage, etc., which collect a lot of small objects in a few huge files, resulting in more complex apps that hardly offer the same performance.

Not really. The Registry doesn't attempt to solve the general problem; it's just a hierarchical storage area for small pieces of data (not small and large, which a full file system has to accommodate).

The Docfile format does attempt to solve this to some extent. It is effectively a file system within a file, and sits on top of the file system.

Posted: **Thu Jun 26, 2003 8:53 pm**

I would just change the FS setup to:

/
/system
/bin
/boot
/plugin
/dev
/Tux
*My files mounted here
/Buddy
*Buddy's files mounted here

Also I would like variable file offsets. If a file is 1 byte, then the file only takes 1 byte of memory. (Excluding the file headers)

Posted: **Fri Jun 27, 2003 1:34 am**

Tux wrote: Also I would like variable file offsets. If a file is 1 byte, then the file only takes 1 byte of memory. (Excluding the file headers)

Well, that's what we would all like very much, no? The thing is, hard drives store data in 512 byte chunks (or a multiple of that)... so come up with a way to store 1 byte of data in only one byte of HD space, and you've built a better mousetrap.
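The arithmetic behind that, as a quick sketch. The function names are made up; 512 is the sector size quoted above, and the allocation unit is a parameter so that larger cluster sizes can be tried as well.

```python
SECTOR = 512  # bytes per sector, as quoted above


def allocated_bytes(file_size, unit=SECTOR):
    """Space actually taken on disk: file size rounded up to whole units."""
    if file_size == 0:
        return 0
    return -(-file_size // unit) * unit  # ceiling division


def slack(file_size, unit=SECTOR):
    """Bytes allocated but not used by the file's data."""
    return allocated_bytes(file_size, unit) - file_size
```

For example, `slack(1)` is 511 bytes with 512-byte sectors, and `slack(1, 32 * 1024)` is 32767 bytes with 32k clusters.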

Posted: **Fri Jun 27, 2003 12:17 pm**

Actually, it would be possible to some extent. Just consider *.tar balls or *.big files (used by games, e.g. Quake). The problem is that it really doesn't solve anything other than having you end up with a flat, sequential file system within the already existing FS.

Posted: **Tue Dec 09, 2003 6:53 am**

I'm going to add a 45-min post here, so bear with me...

I've thought about making a new file system for a while now, and I've looked through this thread a number of times. Since a lot of us are on the same road, it might even be possible to make a proprietary OS for each, plus a share-between-OSes filesystem that is less dumb than FAT16 or FAT32. For instance, one that even allows encryption and/or compression and/or journalling...

My current concept of a filesystem is a system that takes a number of partitions, merges them virtually (each has its own header, with all but one stating that they are not normal stand-alone partitions) and then redivides them virtually again.

The idea is that this first layer takes care of all the sector handling, journalling (done only on the fastest device, be it a 20krpm SCSI disk, a flash drive or something similar) and dividing up all the content. It also manages strips of free blocks (since in a modern file system, with users that cannot be bothered to keep track of all their files, just about all files are never going to be changed anyway) to make contiguous allocation possible, as well as a heap that is used for last sectors, each of which takes less than a whole sector on the heap, so that slack space is more or less eliminated. Of course, the heap will easily fragment, because all bits have to be aligned. This doesn't matter, since it's only a few bytes per file (around 10 or so, not 32k).

At the base of the second level, there is a filesystem that stores blocks of variable length and inodes. The first level is optimized to place on the fastest disk both the journal, in duplicate (the journal is for fast recovery; if the CRC of the journal fails, you're still fucked. This is the idea of Stable Storage; read Tanenbaum's Modern Operating Systems), and all the files that are read very regularly (OS drivers for noncritical devices, your current documents). It has a daemon running at that level that tracks how long files have gone without being accessed (no linked information, just on a per-file basis), and it moves the unused files to a compacted location at the slower end of a disk.

As a final ability, the first level contains a small mini-filesystem without many features, held in a few megabytes at the start of the core partition, which contains the filesystem layout (telling which partition goes where, and where to find various things), the core directory and things like that.
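One possible reading of "journal in duplicate, protected by a CRC", sketched with Python's `zlib.crc32`. The record layout (length, CRC32, payload) and the function names are assumptions for illustration, not part of the design above: each journal record is written twice, and recovery takes the first copy whose checksum still matches.

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # payload length, CRC32 of payload (assumed layout)


def seal(payload: bytes) -> bytes:
    """Prefix a journal payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unseal(record: bytes):
    """Return the payload, or None if the record is truncated or corrupt."""
    if len(record) < HEADER.size:
        return None
    length, crc = HEADER.unpack_from(record)
    payload = record[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload


def recover(copy_a: bytes, copy_b: bytes):
    """Use whichever of the two copies still passes its CRC check."""
    for copy in (copy_a, copy_b):
        payload = unseal(copy)
        if payload is not None:
            return payload
    return None  # both copies damaged: the journal cannot help


good = seal(b"move inode 7")
damaged = bytearray(good)
damaged[-1] ^= 0xFF  # simulate a bad sector in one copy
```

With one damaged copy, `recover(bytes(damaged), good)` still returns the payload; only when both copies fail their checks is the record lost.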

Note also that the first level does not do directories or file naming.

The second level adds the directory hierarchy, each directory being a file in its own right (even the core directory and root directories). Each directory entry points to an offset in the global filename heap (which can grow) and an inode.
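A small sketch of such a directory, where each fixed-size entry holds only an offset into a shared name heap plus an inode number. The field widths (32-bit offset, 64-bit inode), the NUL-terminated UTF-8 names and the class and function names are assumptions made for the example.

```python
import struct

# Fixed-size directory entry: name offset (32-bit), inode number (64-bit).
ENTRY = struct.Struct("<IQ")


class NameHeap:
    """Global, growable heap of NUL-terminated filenames."""

    def __init__(self):
        self.buf = bytearray()

    def add(self, name: str) -> int:
        offset = len(self.buf)
        self.buf += name.encode("utf-8") + b"\x00"
        return offset

    def get(self, offset: int) -> str:
        end = self.buf.index(b"\x00", offset)
        return self.buf[offset:end].decode("utf-8")


def add_entry(directory: bytearray, heap: NameHeap, name: str, inode: int) -> None:
    # The directory is itself just a byte stream, i.e. a file.
    directory.extend(ENTRY.pack(heap.add(name), inode))


def list_dir(directory, heap: NameHeap):
    return [(heap.get(off), ino) for off, ino in ENTRY.iter_unpack(directory)]


heap = NameHeap()
root = bytearray()
add_entry(root, heap, "boot", 2)
add_entry(root, heap, "home", 3)
listing = list_dir(root, heap)
```

Keeping the entries fixed-size makes a directory easy to scan or index, while names of any length live in the heap.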

The second level also administers the second half of the inode (kind of kludgy, but it's the best I can do without overloading the first level even more), which points to an XML-based metadata file (thanks for the idea, everybody above this one). For files, it contains information for searching in the file; for directories, information for searching in the directory.

It also compresses, encrypts and authenticates all files that request it, and it enforces the security and quota policy.

The second level has the final requirements that it must contain and administer the root directories, and it must hide the core directory from all layers above it.

At this point there are a number of virtual file systems, each marked with a root directory, a quota, a number of ACLs for authentication (ACLs are also stored in files), and metadata for searching, indexing, optimizing, auto-summarizing and all the other things people want to do with it. It's XML, it's extensible.

As far as I can see now, this file system would satisfy me (as for all workstation users I can imagine) for the coming 10 years, at least.

BrainStorming: what would you put in a FileSystem?
