I'm reading about the theory of file systems and considering some
of them for high performance (read-only file systems excluded).
Currently I'm thinking of 2 matters:
(1) Performance of file finding (best speed when looking up a file by its name)
(2) Performance of in-file seeking (best speed when seeking to a byte within a file)
Are there any other criteria for file system performance?
And with those criteria, which FS gives the highest performance?
Which file system has highest performance?
- pauldinhqd
- Member
- Posts: 37
- Joined: Tue Jul 12, 2011 9:14 am
- Location: Hanoi
- Contact:
Last edited by pauldinhqd on Fri Apr 20, 2012 11:40 pm, edited 1 time in total.
AMD Sempron 140
nVidia GTS 450
Transcend DDR2 2x1
LG Flatron L1742SE
Re: Which file system has highest performance?
Firstly, the perceived performance of a filesystem is affected by the implementation; for example, it is affected greatly by the cache that sits above the filesystem. Having said that, look for the following.
pauldinhqd wrote: I'm reading the theories of file systems and considering some
of them for high performance. (read-only file systems excluded).
Currently I'm thinking of 2 matters:
(1) Performance on file finding (best speed when looking for a file with its name)
(2) Performance on file internal-seeking (best speed when seek to a byte in file)
Any other criteria on file system performance
And with those criteria, which FS gives the highest power?
(1) Some filesystems store directory entries in a linear array or linked list. This means that finding a file requires a linear search, which is not ideal. Look for a FS that uses, for example, a hashing mechanism to find files more quickly.
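For illustration (a toy in-memory sketch, not any real FS's on-disk format), hashing directory names into buckets replaces the full scan with a single bucket probe:

```python
# Toy directory lookup: linear array vs. hash buckets.
# Illustrative sketch only, not a real filesystem's layout.

def linear_lookup(entries, name):
    """FAT-style: scan every directory entry until the name matches."""
    for entry_name, inode in entries:
        if entry_name == name:
            return inode
    return None

def make_hashed_dir(entries, nbuckets=64):
    """Hash each name into a bucket; a real FS would hash into
    on-disk buckets or b-tree keys (e.g. ext3/4 htree)."""
    buckets = [[] for _ in range(nbuckets)]
    for name, inode in entries:
        buckets[hash(name) % nbuckets].append((name, inode))
    return buckets

def hashed_lookup(buckets, name):
    """Only one bucket (~N/nbuckets entries) is scanned."""
    for entry_name, inode in buckets[hash(name) % len(buckets)]:
        if entry_name == name:
            return inode
    return None

entries = [(f"file{i}.txt", 1000 + i) for i in range(10_000)]
buckets = make_hashed_dir(entries)
assert linear_lookup(entries, "file9999.txt") == hashed_lookup(buckets, "file9999.txt") == 10999
```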
(2) Some filesystems store the data contents of a file in a linked list of clusters. FAT is like this. It can be very slow to seek to the end of a file. Modern file systems use various methods to help here. For example, EXT uses the traditional Unix method with blocks, indirect blocks and double-indirect blocks.
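The difference can be sketched as a simple cost model (the 4 KiB block and 32-bit pointer sizes here are common textbook assumptions, not any specific FS's parameters):

```python
# Cost of seeking to a byte offset: FAT-style chain walk vs. classic
# Unix inode indexing. Illustrative cost model only.

BLOCK = 4096  # assumed block/cluster size

def fat_seek_cost(offset):
    """FAT must walk the cluster chain from the start of the file:
    one FAT lookup per cluster before the target."""
    return offset // BLOCK  # chain links traversed

def ext_seek_cost(offset, ptrs_per_block=BLOCK // 4, direct=12):
    """Classic Unix inode: 12 direct pointers, then single- and
    double-indirect blocks. Cost = extra metadata blocks read."""
    block_index = offset // BLOCK
    if block_index < direct:
        return 0                 # pointer is in the inode itself
    block_index -= direct
    if block_index < ptrs_per_block:
        return 1                 # read one indirect block
    return 2                     # read double-indirect + indirect

# Seeking 100 MiB into a file:
offset = 100 * 1024 * 1024
print(fat_seek_cost(offset))     # 25600 chain lookups
print(ext_seek_cost(offset))     # 2 extra block reads
```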
If a trainstation is where trains stop, what is a workstation ?
Re: Which file system has highest performance?
For filesystems, one good measure of file reads is how many additional sectors have to be read. For example, for a small file (less than the sector size) it's one (the inode). For a medium file (not bigger than sectorsize × sectorsize) you have to use indirect blocks, which means random access requires 2 additional reads (inode + indirect block). Nowadays, modern filesystems try to minimize this; therefore most of them tend to use b-trees and extents. First was XFS (if I remember well), then ZFS and btrfs followed.
Benefits:
- fast lookup on filenames (b-tree)
- small size of metainfo (well, at least right after fs creation)
Disadvantages:
- can become terribly slow over time, as the metainfo can grow quickly
- quite a lot of CPU work to keep the metainfo consistent and small
Extents were developed to avoid the need to load additional sectors (the allocation info of really big files is stored right in the inode), but they introduced a new bottleneck.
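The extent idea can be sketched like this, with a made-up extent list: each extent maps a run of file blocks to a run of disk blocks, and a binary search resolves an offset without touching per-block pointers:

```python
import bisect

# Extent = (first file block, first disk block, length in blocks).
# Illustrative sketch of extent-based mapping (the XFS/ext4-style idea);
# the extent values below are invented for the example.
extents = [(0, 500, 8), (8, 900, 16), (24, 40, 100)]
starts = [e[0] for e in extents]

def map_block(file_block):
    """Binary-search the extent covering file_block, then offset into it."""
    i = bisect.bisect_right(starts, file_block) - 1
    fstart, dstart, length = extents[i]
    assert fstart <= file_block < fstart + length, "hole or past EOF"
    return dstart + (file_block - fstart)

print(map_block(0))    # 500
print(map_block(10))   # 902
print(map_block(30))   # 46
```

Three extents here cover 124 file blocks; the same file would need 124 individual block pointers in the indirect-block scheme.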
Extents turned out to be less useful in real-life apps (for heavily loaded servers at least). An easy way to see the disadvantage is:
1. create an fs with btrfs, for example
2. create an SQLite database file with a "tbl" table in it
3. insert about 1 million records into that table
4. now measure the time for "select * from tbl;", save the result as x
5. execute tens of millions of random updates (the more updates you use, the easier it is to see the disadvantage)
6. measure "select * from tbl;" again; let's say the result is y
You can see that y is considerably bigger than x. That's because the extents became fragmented and non-contiguous. You will have to spend a lot of CPU time defragmenting them, and it's not guaranteed you'll succeed (extents can remain fragmented if they were not neighbors).
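The procedure can be sketched with Python's sqlite3 module, scaled down roughly a thousandfold so it finishes in seconds. At this size it only demonstrates the methodology; the fragmentation effect itself needs the original scale on an extent-based filesystem:

```python
import os
import random
import sqlite3
import tempfile
import time

# Scaled-down sketch of the benchmark described above (1,000 rows and
# 10,000 updates instead of millions). Run it on the filesystem under
# test; table and column names follow the steps in the post.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, payload TEXT)")
db.executemany("INSERT INTO tbl (payload) VALUES (?)",
               (("x" * 100,) for _ in range(1_000)))
db.commit()

def full_scan_seconds():
    t0 = time.perf_counter()
    db.execute("SELECT * FROM tbl").fetchall()
    return time.perf_counter() - t0

x = full_scan_seconds()                      # step 4
for _ in range(10_000):                      # step 5, scaled down
    db.execute("UPDATE tbl SET payload = ? WHERE id = ?",
               ("y" * random.randint(50, 150), random.randint(1, 1_000)))
db.commit()
y = full_scan_seconds()                      # step 6
print(f"x={x:.4f}s  y={y:.4f}s")
```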
So the question is: measure an fs for what? There's no such thing as "highest power".
- pauldinhqd
- Member
- Posts: 37
- Joined: Tue Jul 12, 2011 9:14 am
- Location: Hanoi
- Contact:
Re: Which file system has highest performance?
turdus wrote: So the question is, measure an fs for what? There's no such thing, "highest power".
I meant to ask about the "most powerful" FS;
possibly, if one FS is strong in certain features, it's weak in others.
gerryg400 wrote: (1) Some filesystems store directory entries in a linear array or linked list. This means that finding a file requires a linear search, which is not ideal. Look for a FS that uses, for example, a hashing mechanism to find files more quickly.
(2) Some filesystems store the data contents of a file in a linked list of clusters. FAT is like this. It can be very slow to seek to the end of a file. Modern file systems use various methods to help here. For example, EXT uses the traditional Unix method with blocks, indirect blocks and double-indirect blocks.
FATx uses a 'table' for the directory structure, and a 'linked list' for block allocation.
NTFS uses a 'b-tree' for the directory structure, and a 'bitmap' for block allocation.
NTFS seems better than FAT because a 'b-tree' allows searching in O(log N)
and random access to the allocation info,
while both the table and the linked list require sequential search.
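The bitmap side can be sketched as a toy structure (not NTFS's actual $Bitmap format): checking a given block is O(1) random access, and a free-block scan can skip eight fully used blocks per byte:

```python
# Toy block-allocation bitmap: one bit per block, 1 = in use.
# Illustrative sketch of the NTFS-style idea.

class Bitmap:
    def __init__(self, nblocks):
        self.bits = bytearray((nblocks + 7) // 8)
        self.nblocks = nblocks

    def is_free(self, n):
        """O(1) random access to any block's allocation state."""
        return not (self.bits[n // 8] >> (n % 8)) & 1

    def allocate(self):
        """Find and claim the first free block."""
        for i, byte in enumerate(self.bits):
            if byte != 0xFF:          # skip 8 used blocks at a time
                for bit in range(8):
                    n = i * 8 + bit
                    if n < self.nblocks and not (byte >> bit) & 1:
                        self.bits[i] |= 1 << bit
                        return n
        return None                   # disk full

bm = Bitmap(64)
assert bm.allocate() == 0
assert bm.allocate() == 1
assert not bm.is_free(0) and bm.is_free(2)
```

By contrast, answering "is cluster N free?" in FAT means reading the FAT entry for N, and following a file's chain is inherently sequential.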
AMD Sempron 140
nVidia GTS 450
Transcend DDR2 2x1
LG Flatron L1742SE
Re: Which file system has highest performance?
IMO these are some of the nice ideas from modern FS design:
It maintains hot zone(s)
On-disk caching of lookups for the most-visited files, boot files, files of special interest; some kind of search hints
Grouping the hot files together for quick loading
It handles file growth nicely
Some files, for example logs, grow slowly over time. This is usually true when copying very big files as well: they grow from small to bigger.
Imagine multiple log files growing, or multiple file copy operations performed in parallel (e.g. app installation or patching); it would be a disaster for FAT.
Delayed writes (in the system architecture), aggressive pre-allocation, putting gaps between allocations, etc., to minimize fragmentation
Redundancy / failure recovery
Imagine a power failure or system crash in the middle of a file operation...
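The pre-allocation point can be sketched with a toy allocator (purely illustrative; real mechanisms such as XFS's speculative preallocation are far more sophisticated) that reserves slack blocks past a growing file's end so parallel appenders don't interleave:

```python
# Toy pre-allocating block allocator: each growing file gets a run of
# slack blocks reserved past its end. Illustrative sketch only.

next_free = 0
reservations = {}   # file name -> (start, reserved_len, used_len)

def append_block(name, slack=8):
    """Give `name` one more block, pre-reserving `slack` blocks ahead."""
    global next_free
    if name not in reservations:
        reservations[name] = (next_free, 1 + slack, 0)
        next_free += 1 + slack
    start, reserved, used = reservations[name]
    if used == reserved:             # reservation exhausted: extend it
        reservations[name] = (next_free, 1 + slack, 0)
        next_free += 1 + slack
        start, reserved, used = reservations[name]
    reservations[name] = (start, reserved, used + 1)
    return start + used

# Two logs growing in parallel stay contiguous instead of interleaving
# block-by-block the way a naive next-free allocator (or FAT) would:
a = [append_block("a.log") for _ in range(5)]
b = [append_block("b.log") for _ in range(5)]
assert a == [0, 1, 2, 3, 4] and b == [9, 10, 11, 12, 13]
```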
Re: Which file system has highest performance?
I agree; I have only one point to add.
bluemoon wrote: IMO these are some of the nice ideas from modern FS design:
It maintains hot zone(s)
On-disk caching of lookups for the most-visited files, boot files, files of special interest; some kind of search hints
Grouping the hot files together for quick loading
It handles file growth nicely
Some files, for example logs, grow slowly over time. This is usually true when copying very big files as well: they grow from small to bigger.
Imagine multiple log files growing, or multiple file copy operations performed in parallel (e.g. app installation or patching); it would be a disaster for FAT.
Delayed writes (in the system architecture), aggressive pre-allocation, putting gaps between allocations, etc., to minimize fragmentation
Redundancy / failure recovery
Imagine a power failure or system crash in the middle of a file operation...
Data integrity
Checksums for the metainfo as well as for the data. As capacity grows, there's a higher chance of facing data corruption (from cosmic radiation, for example).
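A minimal sketch of the checksum idea, using CRC32 for brevity (real filesystems such as ZFS and btrfs use stronger checksums, e.g. fletcher or SHA-256, over both metadata and data):

```python
import zlib

# Sketch: a per-block checksum stored alongside the data lets the FS
# detect silent corruption on read. Illustrative, not any FS's format.

def write_block(data: bytes):
    """Store the block together with its checksum."""
    return (zlib.crc32(data), data)

def read_block(stored):
    """Verify the checksum before handing the data back."""
    checksum, data = stored
    if zlib.crc32(data) != checksum:
        raise IOError("checksum mismatch: block is corrupt")
    return data

blk = write_block(b"important metadata")
assert read_block(blk) == b"important metadata"

# Simulate a flipped bit ("cosmic radiation"):
corrupt = (blk[0], b"importZnt metadata")
try:
    read_block(corrupt)
    assert False, "corruption not detected"
except IOError:
    pass
```

With redundant copies (mirroring, RAID-Z), the FS can go further and repair the bad block from a copy whose checksum verifies.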
- pauldinhqd
- Member
- Posts: 37
- Joined: Tue Jul 12, 2011 9:14 am
- Location: Hanoi
- Contact:
Re: Which file system has highest performance?
@bluemoon, @turdus
The lookup cache & redundancy checks are interesting and important points to note down;
these are quite advanced features of a FS.
AMD Sempron 140
nVidia GTS 450
Transcend DDR2 2x1
LG Flatron L1742SE