Need advices for designing a simple file system

Brendan · Post by **Brendan** » Fri Apr 27, 2012 10:32 pm

Hi,

pauldinhqd wrote:This big-O is based on the number of byte-to-byte comparisons.
Am I getting this right

No. It's based on the number of operations, while ignoring what type of operations and how expensive the operations are. A "cmp byte [buffer],0" instruction might be one operation and loading a sector from a floppy disk might be one operation; so loading 10 sectors from floppy (something that probably takes 2 seconds) is "as fast as" doing 10 comparisons (which probably takes a few nanoseconds). See how idiotic that is?

When attempting to use "big O notation" for file systems; maybe it'd make more sense to break operations up according to type. For example, one file system might be "O(log n) seeks, O(log n) reads and O(log m) comparisons", and another file system might be "O(1) seeks, O(n) reads and O(m) comparisons". From this it's a lot easier to tell that the first file system is going to be better for something like SSD (where seek is virtually free) and that the second file system is going to be better for hard disk (where seek is expensive).

To get even more accurate comparisons you could break the types of operations up even further - maybe "seeks to a random location" and "seeks to a sector that's likely to be close to where the disk heads are", and "reads with rotational latency costs" and "subsequent reads without rotational latency costs", etc.

If you attempt to do it in a correct/sane way, then you end up creating formulas that can be used to estimate the actual costs. For example, you could benchmark the cost of each type of operation for different devices and plug those costs into your formulas to generate something like an "estimated performance curve graph" that clearly shows how each file system being compared performs for each type of device.

If you attempt to do it in a correct/sane way, then you end up not using "big O notation" at all.

Cheers,

Brendan

gerryg400 · Post by **gerryg400** » Fri Apr 27, 2012 10:55 pm

Don't forget also that most OS's have a number of caches between the OS and the on disk filesystem structures. Typically these include a name cache (or dentry_cache in the Linux world), an inode cache and a block cache. The name cache may hold thousands of entries and the result is that the on-disk structure used to store directories is not as important as the in-memory cache structure that holds directories.

pauldinhqd · Post by **pauldinhqd** » Sat Apr 28, 2012 2:00 pm

I see berkus & Brendan mentioned earlier about patricia trie (radix trie).
Patricia trie is better or B-tree is better

It seems that trie takes a lot of space. However, storing the directory structure
as trie is much more human-friendly than btree I think

,
and trie is fast also.

pauldinhqd · Post by **pauldinhqd** » Mon Apr 30, 2012 1:16 am

I'm implementing the prefix tree (trie) for my file system folder structure,
a single node has the following struct (allowing 127 characters, 0x01 to 0x8F):

Code: Select all

#pragma pack(1)

//implementation 1:
//1 + 127*8 + 7 = 1KB
struct {
  char Character;
  unsigned long long Children[127];
  char Reserved[7];
};

However, the above struct has problem with its size.
Let's say we have 1 million files on file system,
and the maximum length for a full file path is 1 thousand characters;
these values are quite realistic on a 1TB harddrive.

Then the number of nodes in the prefix tree is around a half of 1 million * 1 thousand.
This is almost 1 billion nodes.

The total space needed to store the prefix tree (containing folder structure)
on disk is: 1 billion * 1KB = 1TB. This 1TB is taking the whole harddrive!!!

Another option is using the 'next sibling' and 'first child' for a node,
like following:

Code: Select all

#pragma pack(1)

//implementation 2:
//1 + 8 + 8 + 7 = 24 bytes
struct {  
  char Character;
  unsigned long long Next_Sibling;
  unsigned long long First_Child;
  char Reserved[7];
};

But this method doesn't support ordering the nodes alphabetatically.
With every insertion of a node, I have to scan thru' all levels of the prefix tree
to find an appropriate position to put the new node in.
Should I use the implementation 1 or implementation 2

StevenDilley · Post by **StevenDilley** » Tue Jun 05, 2012 7:21 am

Hello friends,

A file system is a core component of most operating systems and the implementation of this system will provide experience designing complex systems, one of the major topics in 6.033. File systems, by their nature, incorporate aspects of naming, fault-tolerance, concurrency control, and storage management. In addition, issues of security and networking appear in more ambitious designs.

Best regards
Steven Dilley

OSwhatever · Post by **OSwhatever** » Tue Jun 05, 2012 1:14 pm

I had to design a simple file system of my own. I used B+tree as fundamental structure very much like Btrfs (basically copied the core design). B+tree is a very flexible structure and makes it easy to add features afterwards. Performance wise I would assume using B+tree for everything is slower than using the traditional way with inodes with direct and indirect blocks. Regardless of this I think B+tree is so convenient to work with that advantages outweighs the disadvantages. At least for the simple use case I have.

Don't get too blinded by performance. VFAT is faster than NTFS, yet almost everybody prefer NTFS. Ext4 is faster than Btrfs but in time the features of Btrfs will make more people switch to it.

amd64pager · Post by **amd64pager** » Tue Jun 05, 2012 10:39 pm

Keeping two different lists of used blocks and unused blocks might work,a bit more efficient than bitmap since you only need to read the free block list's first element.
This will be a bit fast(slow in deallocation but),since allocating a block is like this:

Read in the free list's first entry
Put it in the used blocks list and delete it from the free blocks list

and deallocating like this:

Delete the blocks entry in the used blocks list
Put it in the free blocks list

This will make it a bit easier,but you will have to find how to keep the list (You can use fixed size tables or a linked list with the block numbers,next list data sector number and last sector data sector number - but the latter will need you to allocate also , so for that reason always keep one empty element in the used block list's last sector)

amd64pager · Post by **amd64pager** » Wed Jun 06, 2012 5:54 am

berkus wrote:
amd64pager wrote:This will make it a bit easier,but you will have to find how to keep the list
Keeping it in btrees sorted by size (and maybe block location) sounds like a good idea.

Keeping a old concept in a new container

amd64pager · Post by **amd64pager** » Wed Jun 06, 2012 6:40 am

berkus wrote:Eventually, you will need to find a block of appropriate size...

Nice infinite loop,isn't it (lol) - in fact I couldn't quote your whole post.

amd64pager · Post by **amd64pager** » Thu Jun 07, 2012 5:22 am

berkus wrote:
amd64pager wrote:
berkus wrote:Eventually, you will need to find a block of appropriate size...
Nice infinite loop,isn't it (lol)
Not at all, you think of fixed size blocks, I think of extents.

Keeping the extent data?

amd64pager · Post by **amd64pager** » Fri Jun 08, 2012 1:19 am

berkus wrote:
amd64pager wrote: Keeping the extent data?
There's no data in free extents, is there?

Used extents?

Well,I think we must stop.

amd64pager · Post by **amd64pager** » Fri Jun 08, 2012 4:23 am

berkus wrote:
amd64pager wrote:Used extents?
Yes, the ranges of blocks belonging to files and containing the file data. Used extents, y'know?

I mean managing the used extents list.

amd64pager · Post by **amd64pager** » Sat Jun 09, 2012 1:38 am

Just like page directories

OSDev.org

Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system

Re: Need advices for designing a simple file system