OSDev.org

Posted: **Wed Jan 18, 2006 4:29 am**

Kemp wrote: I would prefer that soft-links (that's what windows uses for shortcuts right?) were supported. Though doesn't windows just store them as a file that tells it where to go anyway?

AFAIK, windows' soft links are stored as simple, short files with the extension .lnk (which is always hidden by default by the windows explorer). So there is no extra support in the filesystem to add.

Posted: **Wed Jan 18, 2006 4:48 am**

Brendan wrote: I've updated the draft specification again to take into account the recent discussions.
http://bcos.hopto.org/sfs.html

Seems nice to me.
things that will be straightforward to implement
- searching a file from its name
- accessing byte #i from a file (once entry is found)
- tagging a file as "deleted"

things that needs precision
- how to add a new file (and reserve room for it)
- how to return space from a 'tagged-as-deleted' file as "free for use" space.
- how to organize the 'VFS' so that usual "open file (and create), write to file, write to file again, write to file (yet again), close file" sequence leads to a complete file stored on the disk.

things that are totally obscure to me
- how to deal with bad sectors
- what we can do if some sector in the "index area" ever goes bad ...

If a "level 2" specification (e.g. allowing file fragmentation or extending the model to "storage in embedded devices such as digital camera, mp3 recorders, ...") is envisionned, i suggest we already allocate a "level-2 file" entry type that level-1 specification could skip, knowing that they would only be able to read the first fragment (if the user really wants to

Posted: **Wed Jan 18, 2006 8:11 am**

What about making a SSFS (SuperSimpleFS) and a SFS(SimpleFS)? Compatible with each other?
Becasue right now, Brendan is flooded with suggestions that starts to drift away from the _very_ simple ideas he had from the begining. Nothing stops us from making more then one FS, right?

Christoffer

Posted: **Wed Jan 18, 2006 8:29 am**

bubach wrote: Becasue right now, Brendan is flooded with suggestions that starts to drift away from the _very_ simple ideas he had from the begining. Nothing stops us from making more then one FS, right?

You should consider the implications of the page title. A universal file system isn't universal if it isn't like, the only one that's called universal.

I've been playing with the idea however (before, about a few days) and I considered it a good idea. I would call them level 1 for non-fragmented simple access and level 2 for fragmented access, where the same driver would be used and the same structures. The file system indicates in the superblock whether or not fragmented files occur, and drivers attempt their best not to fragment files (at least, don't go making it unreadable if not necessary). Placement of files is done with a variant of best-fit or good-fit.

The compatibility is in lvl1 requiring all files to have a single extent. Each inode has one extent, which is either direct or indirect. If it's direct, no fragmentation, If it's indirect, it's fragmented (and the extent contains the extent list).

That way, it's trivial to check whether a file can be read by a lvl1 file system driver. It cannot write to a lvl2 disk however, since it can't be sure what is or isn't in use.

lvl1 conformance would be indicated by a 32-bit counter at the start that indicates the amount of fragmented files on the file system. If it's 0, there are none, so it can be written by lvl1 drivers.

That was my summary. I get the idea that Brendan's ignoring some of my ideas since they sound complex.

Posted: **Wed Jan 18, 2006 8:45 am**

ah, ok.. sorry.

I get the idea that Brendan's ignoring some of my ideas since they sound complex.

hehe.. ;D

Posted: **Wed Jan 18, 2006 9:52 am**

bubach wrote: ah, ok.. sorry.

That's ok, I'm messing with filesystems too. I'm just not publishing them here (this thread is Brendan's and I'd like to keep it that way so it's a little organized). Plus, I think Brendan's got a good judgement and he was first. So, if you (or me) want to make a different universal filesystem, nobody's going to stop you outright but leave this thread to the development of the filesystem as Brendan is guiding it.

Posted: **Wed Jan 18, 2006 10:00 am**

I like the extension idea. It can support fragmentation, or not support fragmentation, but would it really be worth it? It would be just as easier to read and write to a fragmented file system as a non-fragmented, right?

I think people can understand linked lists. It might and I think it could make coding for the file system much easier. Creating files would be very simple, so would writting files when they need to be made longer.

In the unfragmented system. If I want to write to file, I actually could end up recreating it(easy), or trying to defragment the file system to make room and... all that time I might have to defragment the file multiple times if I dont make enough free room after the file on the first run...

The unfragmented file system does keep things straight forward, but it does require some trickly issues that people likely implementing the FS would ask themselves, and might become flusterated? (mabye) Im asking, not saying.

So, using a node typed file system it could have some nice benefits such as straight forward write, delete, and create functions.

So mabye the fragment system shifted complexity out of code writting, and into the specification but which is easier to learn and do?

- how to organize the 'VFS' so that usual "open file (and create), write to file, write to file again, write to file (yet again), close file" sequence leads to a complete file stored on the disk.

Posted: **Wed Jan 18, 2006 10:37 am**

Hi,

bubach wrote:Becasue right now, Brendan is flooded with suggestions that starts to drift away from the _very_ simple ideas he had from the begining. Nothing stops us from making more then one FS, right?

Flooded a bit, but not really from this topic - I've been testing a Bochs patch for one of the Bochs developers (setting number of CPUs in "bochsrc.txt" instead recompiling!), completing the "text-mode-only" boot manager code (which was left over from previous debugging sessions, so I figured I'd make it useful), preparing the next release of the OS and updating my web pages (and I've just earnt enough XP to raise my character's magic skill in an online game

).

Candy wrote:I get the idea that Brendan's ignoring some of my ideas since they sound complex.

Not ignoring them, but letting them churn around in my subconscious and mingle with things said by DF.

Basically, it would be reasonably easy to set the "version number" in the super-block to 0x20 (or version 2.0), and then use another entry type (e.g. 0x13) for a fragmented file with extents (where the starting and ending block numbers refer to a list of extents instead of the file's data).

This would mean code that supported version 2.0 would be able to work on media in the "version 1.0" format without any changes.

As for the "fragmented file counter", this wouldn't really be needed as the file system driver could maintain it's own internal fragmented file counter (which could be set when the file system is mounted/initialized, done in a lazy way if the code caches index area entry details, or set during part of the "compaction" process). If this counter ever reaches zero, then the file system driver could automatically change the version number in the super-block back to 0x10.

[Side-note: I'm trying to minimize the amount of stuff in the super-block, as this sector is shared with the boot sector and it's hard enough fitting enough code in a boot sector already. This includes things like the volume label..]

The problem here (IMHO) is that the code to do "compaction" would also need to defragment where it can, and finding unused blocks (within the data area) becomes much harder. To make these things easier you'd start thinking about adding a used block bitmap, but you can't "atomically" update the used block bitmap and the super-block a the same time, so it couldn't be implemented in simple, fail-safe way. To fix this you end up with some form of journaling, and before you know it we've re-created ext3.

I'm not against going down this path in general, but it's probably not such a good idea for the first version of a "Simple" file system, and to be honest, I'd rather get the first version done before worrying much about the second version.

Cheers,

Brendan

Posted: **Wed Jan 18, 2006 11:10 am**

Hi,

Pype.Clicker wrote: things that needs precision
- how to add a new file (and reserve room for it)
- how to return space from a 'tagged-as-deleted' file as "free for use" space.
- how to organize the 'VFS' so that usual "open file (and create), write to file, write to file again, write to file (yet again), close file" sequence leads to a complete file stored on the disk.

things that are totally obscure to me
- how to deal with bad sectors

These things will be clarified when I write the missing "Algorithms" section...

Pype.Clicker wrote: - what we can do if some sector in the "index area" ever goes bad ...

Currently, if a sector in the index area is unreadable then all "file entries" stored within it (and the file data they point to) would be lost, but all other files stored on the media would still be readable. All other types of entries in the index area can be lost without effecting file recovery. After recovering what you can (if you actually need to), throw the media in the bin (same as I do now with floppies and writable CDs).

Pype.Clicker wrote:If a "level 2" specification (e.g. allowing file fragmentation or extending the model to "storage in embedded devices such as digital camera, mp3 recorders, ...") is envisionned, i suggest we already allocate a "level-2 file" entry type that level-1 specification could skip, knowing that they would only be able to read the first fragment (if the user really wants to

With the extents idea, the problem would be that the version 1.0 code won't know where the version 2.0 code has put file data. If the "level-2 file" entry type is skipped then level-2 file data will be overwritten. In this case it's better if the version 1.0 code refuses to mount the file system (or possibly, refuses to write to the file system and refuses to read some files).

Cheers,

Brendan

Posted: **Wed Jan 18, 2006 11:18 am**

I think we should get a first "final" spec done, code and test it then start work on another version (if needed).

By thinking about every possible feature we'd like, it will change from a "simple" filesystem to something more complex than initially intended.

Key thing here - it is simple and works.

Posted: **Wed Jan 18, 2006 11:40 am**

Brendan wrote:
Pype.Clicker wrote: - what we can do if some sector in the "index area" ever goes bad ...
Currently, if a sector in the index area is unreadable then all "file entries" stored within it (and the file data they point to) would be lost, but all other files stored on the media would still be readable. All other types of entries in the index area can be lost without effecting file recovery. After recovering what you can (if you actually need to), throw the media in the bin (same as I do now with floppies and writable CDs).

okay. i guess copies of the index area could be written to regular files if _really_ wanted.

With the extents idea, the problem would be that the version 1.0 code won't know where the version 2.0 code has put file data.

well, as long as "regular level-1" operations are performed, that's not a concern, is it? i mean, as long as extents allocation have updated the "size of data area", i think writing new level-1 files couldn't harm. And at least, by making explicit type for "level-2 file entry" at level-1, we can guarantee the absence of oddities such as two files with the same name, one for "level-1", and the other as "level-2".

If the "level-2 file" entry type is skipped then level-2 file data will be overwritten.

i don't exactly see how, except if a level-1 "disk compaction" tool is applied on a medium containing level-2 files... but this is the only pathological case i can see ...

@slasher

By thinking about every possible feature we'd like, it will change from a "simple" filesystem to something more complex than initially intended.

Yep. I apologize to anyone getting the feeling that i'm slowing down the process here ... i just want to make sure we don't need to replace 1.0 by 1.1 when we introduce 2.0

Be assure that i appreciate the initiative and Brendan's efforts to put it altogether

Posted: **Wed Jan 18, 2006 12:20 pm**

Hi,

Pype.Clicker wrote:
With the extents idea, the problem would be that the version 1.0 code won't know where the version 2.0 code has put file data.
well, as long as "regular level-1" operations are performed, that's not a concern, is it? i mean, as long as extents allocation have updated the "size of data area", i think writing new level-1 files couldn't harm. And at least, by making explicit type for "level-2 file entry" at level-1, we can guarantee the absence of oddities such as two files with the same name, one for "level-1", and the other as "level-2".

If the "level-2 file" entry type is skipped then level-2 file data will be overwritten.
i don't exactly see how, except if a level-1 "disk compaction" tool is applied on a medium containing level-2 files... but this is the only pathological case i can see ...

It'd depend on how the file system code is implemented. For example, when data is appended to a file an extremely simple implementation could just copy the file's existing data to the end of the data area so that it can append the new data without problems.

A more optimized implementation could try to use free space between files. For example, if the data area looks like:

[tt]----
11111
111--
--222
-----
-----[/tt]

Then if data needs to be appended to "file 1", it could try to do:

[tt]----
11111
11111
11222
-----
-----[/tt]

When it runs out of space it could even try to shift the smaller "file 2" out the way:

[tt]222-
11111
11111
11111
11111
1111-[/tt]

In any case, most implementations (except for the extremely simple ones) would automatically try to do "compaction" whenever free space runs out - it wouldn't always be a seperate utility, like DOS defrag.

It would be possible to complete the design of "version 1.0", then complete the design of "version 2.0", and then go back and redesign "version 1.0" to avoid compatibility problems like this. I'm willing to try this if people want, but I've got a feeling "version 2.0" might involve more than just extents (and I wouldn't be surprised if something added to version 2.0 makes "forward compatibility" impractical).

Cheers,

Brendan

Posted: **Wed Jan 18, 2006 12:28 pm**

kmcguire wrote: I like the extension idea. It can support fragmentation, or not support fragmentation, but would it really be worth it? It would be just as easier to read and write to a fragmented file system as a non-fragmented, right?

Having nonfragmented files gives a lot cleaner and simpler code. Reading from the file is making a read from start+offset, not some complex method that has to take all sorts of limits into account. You can optimize it by buffering, but that's not for embedded systems which is also nice to support. Also, if it's not necessary, why bother?

I propose to add:

One bit at the start of an inode that indicates whether it's fragmented. If it is fragmented, the old driver doesn't read it.
One variable at the superblock indicating the amount of fragmented files. If it's >0 the old driver doesn't write to the filesystem at all.

I propose to change:

The file length and start/end block information. In the case of a fragmented file, there are (length, start) blocks in the data portion indicated here (start/end block). These blocks indicate where the actual data is. In the case of an unfragmented file, these hold their original intent.

In this way, making a driver for a new OS doesn't need to do any chain following, free space management or anything like that. You just read inodes, read linear data and you're done. For writing, you make an extent list of free blocks, find one that fits, fill in the data structures and that's it again. That's what I figured would be a simple FS.

Then you get the second level. It's the same, but there's the possibility of indirect adressing. Making the initial block list is the original algorithm, expanded with reading indirect blocks. Reading files is the same, writing files boils down to the same again. You don't care about the value at the begin of the FS, but you do keep it up to date.

These two can always cooperate.

Then you need a defragmenter, for those cases there's something wrong or somebody wants to defragment it. Defragmenting a non-critical filesystem is pretty simple: 1. round up all fragments of files and make whole files, 2. move files that can fit in an earlier gap there. End of defragmenting. Optional no3 would be abusing little files and complex algorithms to fill in the remaining gaps, although they're probably not very important to fill (they're only to be used when the FS is nearly full, in which case you're screwed anyway).

In the unfragmented system. If I want to write to file, I actually could end up recreating it(easy), or trying to defragment the file system to make room and... all that time I might have to defragment the file multiple times if I dont make enough free room after the file on the first run...

The point behind the two levels is that the first is as simple as can be, while maintaining some required functions, and the second is normal in terms of what people expect (you can't just get an error for nothing), fairly simple and 100% compatible with the first. So, the first type is useful to make and can grow into the second type. You can test with the first one easily since it doesn't need much space or structures in memory to work (some 100 lines of C code should suffice, perhaps 1000 in assembly).

Keep the file system simple by just plain complaining when the disk is "full". Don't bother implementing the warnings - if you're serious about support, make it lvl2-compliant. In that case, the warnings and complexity is moot.

I see the levels as such:
1. Very simple for beginning OSes, very simple and compact applications such as embedded and that's it.
2. For any normal application. Level 1 is a stepping stone to implementing it which you can fully test and use. It just has quite some limits. Those limits mostly involve very-hard-or-slow-stuff, which lvl2 fixes (using fragmentation).

Example for level 1: My OS in a few months (disk driver testing method), my OS' initial boot image, DennisOS, Bubach's OS for fs testing? Bootloader testing?
Level 2: Any real OS, any "grown-up" driver, photo camera's, flash disks etc.

Posted: **Thu Jan 19, 2006 2:41 am**

Brendan wrote: A more optimized implementation could try to use free space between files. For example, if the data area looks like:
<fig/>
Then if data needs to be appended to "file 1", it could try to do:
<fig/>

When it runs out of space it could even try to shift the smaller "file 2" out the way:
<fig/>

Wooo ... seems to have magic there i had underestimated.

Basically, that means the driver is constructing a "bitmap" (or even something like a "fat") internally, from the content of the "index area" so that it can detect whether blocks around a growing file are free or not, right ? I suppose that makes more sense for the "unusable entries" (which could be FAT files or whatever that we shouldn't mess with).

That suggest the following possible way of actions to me:
- a L1 driver cannot mount a L2 media, or at least it cannot write to it.
- (or) if a L1 driver mounts a L2 media, it should _not_ try to perform such kind of "assumptions" that areas not pointed by an entry in the "index" are free.
- (or) L2 drivers should leave L1 drivers information on the media about the "extended fragments" it allocated, for instance by "unused" entries or by the mean of a file containing "reserved sectors" (but that not camel-and-stick)

Candy wrote: I propose to add:

One bit at the start of an inode that indicates whether it's fragmented. If it is fragmented, the old driver doesn't read it.

uh ? so far, we have no inodes and no plan to have them in UFS ... do we ? or do you mean "index_entry->type & FRAGMENTED" ?

One variable at the superblock indicating the amount of fragmented files. If it's >0 the old driver doesn't write to the filesystem at all.

i'd rather say "the whole data area is considered used and new data are only written to the free area".

Posted: **Thu Jan 19, 2006 2:47 am**

Pype.Clicker wrote: That suggest the following possible way of actions to me:
- a L1 driver cannot mount a L2 media, or at least it cannot write to it.
- (or) if a L1 driver mounts a L2 media, it should _not_ try to perform such kind of "assumptions" that areas not pointed by an entry in the "index" are free.
- (or) L2 drivers should leave L1 drivers information on the media about the "extended fragments" it allocated, for instance by "unused" entries or by the mean of a file containing "reserved sectors" (but that not camel-and-stick)

I would vote for the first, or at least not letting it write to anything it's not entirely certain that it's free.

uh ? so far, we have no inodes and no plan to have them in UFS ... do we ? or do you mean "index_entry->type & FRAGMENTED" ?

Inode, index-entry, whatever you call them. The thingymabob that contains the file information.

i'd rather say "the whole data area is considered used and new data are only written to the free area".

If you have a separate free area, that's OK. You could then allow an even smaller and slightly less functional level1 driver that only allocates in the free area.

OSDev.org

Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System

Re:Universal File System