
Which file system has highest performance?

Posted: Fri Apr 20, 2012 10:12 pm
by pauldinhqd
I'm reading up on file-system theory and considering some file systems
for high performance (read-only file systems excluded).

Currently I'm thinking about two criteria:
(1) File lookup performance (how quickly a file can be found by name)
(2) Intra-file seek performance (how quickly the FS can seek to a given byte offset within a file)

Any other criteria for file system performance?
And with those criteria, which FS gives the highest power?

Re: Which file system has highest performance?

Posted: Fri Apr 20, 2012 11:05 pm
by gerryg400
pauldinhqd wrote:I'm reading up on file-system theory and considering some file systems
for high performance (read-only file systems excluded).

Currently I'm thinking about two criteria:
(1) File lookup performance (how quickly a file can be found by name)
(2) Intra-file seek performance (how quickly the FS can seek to a given byte offset within a file)

Any other criteria for file system performance?
And with those criteria, which FS gives the highest power?
Firstly, the perceived performance of a filesystem is strongly affected by the implementation; for example, the cache that sits above the filesystem matters greatly. Having said that, look for the following:

(1) Some filesystems store directory entries in a linear array or linked list, which means that finding a file requires a linear search; this is not ideal. Look for an FS that uses, for example, a hashing mechanism to locate files more quickly.
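A minimal sketch of this contrast in Python; the entry layout and bucket count are invented for illustration, not taken from any real FS:

```python
# Directory lookup: linear scan vs. a hashed index over the same entries.
# Each entry is a dict {"name": ..., "inode": ...} purely for illustration.

def lookup_linear(entries, name):
    # O(n): scan every directory entry until the name matches.
    for entry in entries:
        if entry["name"] == name:
            return entry["inode"]
    return None

def build_hash_index(entries, buckets=64):
    # O(1) average: hash each name into one of a fixed set of buckets.
    index = [[] for _ in range(buckets)]
    for entry in entries:
        index[hash(entry["name"]) % buckets].append(entry)
    return index

def lookup_hashed(index, name):
    # Only the one bucket the name hashes to needs to be scanned.
    for entry in index[hash(name) % len(index)]:
        if entry["name"] == name:
            return entry["inode"]
    return None
```

With a thousand entries, the linear scan touches hundreds of entries on average, while the hashed lookup touches only one small bucket.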

(2) Some filesystems store the data contents of a file in a linked list of clusters; FAT is like this, and seeking to the end of a large file can be very slow. Modern file systems use various methods to help here. For example, ext uses the traditional Unix method with direct blocks, indirect blocks and double-indirect blocks.
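The traditional Unix scheme can be sketched as a small calculation, assuming ext2-like parameters (4 KiB blocks, 4-byte block pointers, 12 direct pointers in the inode):

```python
# How many extra metadata blocks must be read to find the data block
# holding a given byte offset, under the classic direct/indirect layout.

BLOCK = 4096               # block size in bytes (assumed)
PTRS = BLOCK // 4          # block pointers per indirect block: 1024
DIRECT = 12                # direct pointers stored in the inode itself

def metadata_reads(offset):
    blk = offset // BLOCK
    if blk < DIRECT:
        return 0           # pointer is right in the inode
    blk -= DIRECT
    if blk < PTRS:
        return 1           # one single-indirect block
    blk -= PTRS
    if blk < PTRS * PTRS:
        return 2           # single + double indirect chain
    return 3               # triple-indirect chain
```

So a random read near the start of a file costs no extra I/O, while the same read deep inside a multi-gigabyte file costs up to three metadata reads before the data block is even located.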

Re: Which file system has highest performance?

Posted: Sat Apr 21, 2012 11:47 am
by turdus
For filesystems, one good measure of file-read performance is how many additional sectors have to be read. For a small file (less than a sector), it's one (the inode). For a medium file (no bigger than sectorsize × sectorsize), you have to use indirect blocks, which means a random access requires 2 additional reads (inode + indirect block). Nowadays modern filesystems try to minimize this, which is why most of them use B-trees and extents. XFS was first (if I remember correctly), then ZFS and btrfs followed.

Benefits:
- fast filename lookup (B-tree)
- small metadata size (well, at least right after fs creation)

Disadvantages:
- can become terribly slow over time, as the metadata can grow quickly
- quite a lot of CPU work to keep the metadata consistent and small

Extents were developed to avoid the need to load additional sectors (the allocation info of really big files is stored right in the inode), but they introduced a new bottleneck.
Extents turned out to be of little use in real-life applications (at least for heavily loaded servers). An easy way to see the disadvantage:
1. create a filesystem with btrfs, for example
2. create an SQLite database file with a "tbl" table in it
3. insert about 1 million records into tbl
4. measure the time for "select * from tbl;" and save the result as x
5. execute several tens of millions of random updates (the more updates, the easier it is to see the disadvantage)
6. measure "select * from tbl;" again; call the result y
You will see that y is considerably bigger than x, because the extents have become fragmented and non-contiguous. You would have to spend a lot of CPU time defragmenting them, and even then success is not guaranteed (extents can remain fragmented if they are not neighbors).
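The effect behind those steps can be modeled without SQLite at all. Here is a toy copy-on-write simulation (all structures and numbers invented for illustration) in which random in-place updates relocate blocks and split what started as a single extent:

```python
import random

def count_extents(block_locations):
    # An extent is a maximal run of consecutive on-disk locations.
    extents = 1
    for prev, cur in zip(block_locations, block_locations[1:]):
        if cur != prev + 1:
            extents += 1
    return extents

def simulate(file_blocks=1000, updates=500, seed=1):
    random.seed(seed)
    disk = list(range(file_blocks))        # file starts as one extent
    next_free = file_blocks
    for _ in range(updates):
        i = random.randrange(file_blocks)  # a random logical block is updated
        disk[i] = next_free                # copy-on-write: rewritten elsewhere
        next_free += 1
    return count_extents(disk)
```

With the default parameters the single initial extent shatters into hundreds of fragments, which is the same pattern the SQLite experiment makes visible at scale.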

So the question is: measure an FS for what? There's no such thing as "highest power".

Re: Which file system has highest performance?

Posted: Sat Apr 21, 2012 11:51 pm
by pauldinhqd
turdus wrote: So the question is: measure an FS for what? There's no such thing as "highest power".
I meant to ask about the "most powerful" FS;
possibly, if an FS is strong in certain features, it's weak in others.
gerryg400 wrote: (1) Some filesystems store directory entries in a linear array or linked list, which means that finding a file requires a linear search; this is not ideal. Look for an FS that uses, for example, a hashing mechanism to locate files more quickly.

(2) Some filesystems store the data contents of a file in a linked list of clusters; FAT is like this, and seeking to the end of a large file can be very slow. Modern file systems use various methods to help here. For example, ext uses the traditional Unix method with direct blocks, indirect blocks and double-indirect blocks.
FATx uses a table for its directory structure and a linked list for block allocation.
NTFS uses a B-tree for its directory structure and a bitmap for block allocation.

NTFS seems better than FAT because the B-tree allows name lookup in O(log N)
and random access to allocation info?
And both the table and the linked list require sequential search?
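The FAT side of that comparison can be sketched directly: seeking within a cluster chain means one table lookup per cluster crossed, so the cost grows linearly with the offset. The table layout below is illustrative, not the real FAT on-disk format:

```python
CLUSTER = 4096  # cluster size in bytes (assumed)

def fat_seek(fat, first_cluster, offset):
    """Follow the cluster chain to `offset`; return (cluster, chain steps).

    `fat` maps each cluster number to the next cluster in the file,
    mimicking the File Allocation Table's linked-list structure.
    """
    cluster, steps = first_cluster, 0
    for _ in range(offset // CLUSTER):
        cluster = fat[cluster]   # one table lookup per cluster crossed
        steps += 1
    return cluster, steps
```

Seeking to byte 0 costs nothing, but seeking 50 clusters into the file costs 50 chain steps; an index such as NTFS's B-tree or ext's indirect blocks avoids this linear walk.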

Re: Which file system has highest performance?

Posted: Sun Apr 22, 2012 1:55 am
by bluemoon
IMO these are some of the nice ideas from modern FS designs:

It maintains hot zone(s)
On-disk caching of lookups for the most-visited files, boot files, and other files of special interest; some kind of search hints.
Group the hot files together for quick loading.

It handles file growth nicely
Some files, for example logs, grow slowly over time. The same is usually true when copying very big files: they grow from small to large.
Imagine multiple log files growing, or multiple file-copy operations running in parallel (e.g. app installation or patching); this is a disaster for FAT.
Fragmentation can be minimized by delayed writes (in the system architecture), aggressive pre-allocation, putting gaps between allocations, etc.
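A toy model of that growth problem (all numbers invented): two files growing in parallel under a naive next-free allocator end up interleaved on disk, while reserving a region per file keeps each one contiguous.

```python
def grow_interleaved(blocks_per_file):
    # Naive allocator: each growth step takes the next free block,
    # so two files growing together alternate block for block.
    a, b, next_free = [], [], 0
    for _ in range(blocks_per_file):
        a.append(next_free); next_free += 1
        b.append(next_free); next_free += 1
    return a, b

def grow_preallocated(blocks_per_file, reserve):
    # Pre-allocation: each file grows inside its own reserved region.
    a = list(range(0, blocks_per_file))
    b = list(range(reserve, reserve + blocks_per_file))
    return a, b

def extents(blocks):
    # Count maximal runs of consecutive on-disk locations.
    return 1 + sum(1 for p, c in zip(blocks, blocks[1:]) if c != p + 1)
```

In the interleaved case every single block of each file is its own extent; with a reserved gap, each file stays in one extent.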

Redundancy / failure recovery
Imagine a power failure or system crash in the middle of a file operation...

Re: Which file system has highest performance?

Posted: Sun Apr 22, 2012 2:29 am
by turdus
bluemoon wrote:IMO these are some of the nice ideas from modern FS designs:

It maintains hot zone(s)
On-disk caching of lookups for the most-visited files, boot files, and other files of special interest; some kind of search hints.
Group the hot files together for quick loading.

It handles file growth nicely
Some files, for example logs, grow slowly over time. The same is usually true when copying very big files: they grow from small to large.
Imagine multiple log files growing, or multiple file-copy operations running in parallel (e.g. app installation or patching); this is a disaster for FAT.
Fragmentation can be minimized by delayed writes (in the system architecture), aggressive pre-allocation, putting gaps between allocations, etc.

Redundancy / failure recovery
Imagine a power failure or system crash in the middle of a file operation...
I agree, and have only one point to add:
data integrity
Checksums for metadata as well as for data. As capacity grows, there's a higher chance of facing data corruption (from cosmic radiation, for example).
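A minimal sketch of per-block checksumming using CRC32; the 4-byte-header layout is invented for illustration:

```python
import zlib

def write_block(data: bytes) -> bytes:
    # Prepend a CRC32 of the payload (4 bytes, big-endian) before writing.
    return zlib.crc32(data).to_bytes(4, "big") + data

def read_block(raw: bytes) -> bytes:
    # Recompute the CRC on read and compare against the stored value.
    stored = int.from_bytes(raw[:4], "big")
    data = raw[4:]
    if zlib.crc32(data) != stored:
        raise IOError("checksum mismatch: block is corrupt")
    return data
```

Real filesystems that do this (ZFS, btrfs) use stronger checksums and keep them in the block pointers rather than next to the data, but the read-time verification idea is the same.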

Re: Which file system has highest performance?

Posted: Sun Apr 22, 2012 3:30 am
by pauldinhqd
@bluemoon, @turdus
lookup caching and redundancy checks are interesting and important points to note down;
these are quite advanced FS features :)