Hi,
Beastie wrote: Can anyone please tell me the steps the kernel (*NIX) should take after the open syscall is executed (communication between VFS & FS)?
I'll be thankful for the answer.
The simple answer would be to traverse a tree of entries (while keeping track of details for the current file system), asking the file system for more data when necessary, until you reach the correct entry. Then mark that entry as "opened" somehow and build a structure that describes how it's opened.
For example, for the file "/dir1/dir2/dir3/foo.txt" the VFS would start at "/" and the root file system. If the entry for "/" says none of the directory contents are in the tree then it'd ask the current file system for a listing of everything in the "/" directory and add this information into the tree (possibly getting rid of other entries to make space). Then it'd search for "dir1".
If it finds "dir1" it'd check if it's a mount point, and if it is a mount point it'd set a new current file system and a new current mount point name. If the entry for "/dir1" says none of the directory contents are in the tree then it'd ask the current file system for a listing of everything in the "/dir1" directory (after removing the current mount point name from the "/dir1" string) and add this information into the tree (possibly getting rid of other entries to make space). Then it'd search for "dir2".
If it finds "dir2" it'd check if it's a mount point, and if it is a mount point it'd set a new current file system and a new current mount point name. If the entry for "/dir1/dir2" says none of the directory contents are in the tree then it'd ask the current file system for a listing of everything in the "/dir1/dir2" directory (after removing the current mount point name from the "/dir1/dir2" string) and add this information into the tree (possibly getting rid of other entries to make space). Then it'd search for "dir3".
If it finds "dir3" it'd check if it's a mount point, and if it is a mount point it'd set a new current file system and a new current mount point name. If the entry for "/dir1/dir2/dir3" says none of the directory contents are in the tree then it'd ask the current file system for a listing of everything in the "/dir1/dir2/dir3" directory (after removing the current mount point name from the "/dir1/dir2/dir3" string) and add this information into the tree (possibly getting rid of other entries to make space). Then it'd search for "foo.txt".
Um, it's recursive: the same step repeats for every component of the path.
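To make the walk above a bit more concrete, here's a rough sketch in Python (nothing a real kernel would use, just the logic). The names FileSystem, Entry, list_dir and walk are all made up for illustration, and it assumes the path has already been cleaned up into an absolute path. It loads a directory's contents only when they aren't in the tree yet, and it switches file system (and restarts the file-system-relative path) whenever it crosses a mount point.

```python
class FileSystem:
    """Hypothetical backing file system: maps a directory path (relative to
    its own root) to the names it contains and whether each is a directory."""

    def __init__(self, dirs):
        self.dirs = dirs          # e.g. {"/": {"dir1": True}, "/dir1": {...}}
        self.requests = []        # every directory listing the VFS asked for

    def list_dir(self, path):
        self.requests.append(path)
        return self.dirs[path]


class Entry:
    """One node in the VFS tree."""

    def __init__(self, name, is_dir):
        self.name = name
        self.is_dir = is_dir
        self.children = None      # None means "contents not in the tree yet"
        self.mounted_fs = None    # set if a file system is mounted here


def walk(root, root_fs, path):
    """Return the Entry for an absolute, already sanitized path, or None."""
    entry, fs = root, root_fs
    fs_path = "/"                 # position relative to the current file system
    for part in [p for p in path.split("/") if p]:
        if not entry.is_dir:
            return None           # tried to look inside something that isn't a directory
        if entry.children is None:
            # Ask the current file system, using its own view of the path.
            listing = fs.list_dir(fs_path)
            entry.children = {n: Entry(n, d) for n, d in listing.items()}
        entry = entry.children.get(part)
        if entry is None:
            return None           # "file not found"
        if entry.mounted_fs is not None:
            fs, fs_path = entry.mounted_fs, "/"   # crossed a mount point
        else:
            fs_path = fs_path.rstrip("/") + "/" + part
    return entry
```

With a second file system mounted at "/dir1/dir2", looking up "/dir1/dir2/dir3/foo.txt" asks the root file system about "/" and "/dir1" only, and asks the mounted file system about "/" and "/dir3". Evicting old entries to make space is left out to keep it short.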
Some notes:
1) Before the VFS does anything it'd sanitize the path name to create a unique absolute path. Something like "/a/../b/../c/d/../foo.txt" would be converted into "/c/foo.txt", and something like "~/bar.txt" might be converted into "/home/Brendan/bar.txt". If the application tries to create a new file called "/foo/*?*?*" then it gets a bad filename error.
2) You'd keep track of the current mount point name, so that if the VFS is looking at "/dir1/dir2/dir3" and "/dir1/dir2" happens to be the most recent mount point, then it'd ask the file system mounted at "/dir1/dir2" about the directory "/dir3" (and not the directory "/dir1/dir2/dir3").
2b) If at any step the next piece of the path is not found, then return "file not found".
3) There are file permission checks in there somewhere.
4) The VFS needs to handle "mount" and "unmount" too.
5) For fun, use hashes to speed it up (convert string into a number/hash, compare the number/hash with the number/hash stored in each entry in the directory, if numbers/hashes match compare strings to double check).
6) To speed it up more, optimize for cache locality. Only put data you need in the tree (address of a list of children and name hash) and for each entry in the tree have a pointer to further information (name string, permissions, owner, group, etc). Then keep the entries in the tree separate from everything else (e.g. have two separate heaps, one for tree entries and another for everything else).
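Two of the notes above (sanitizing the path, and comparing hashes before comparing strings) are easy to sketch. This is only an illustration in Python: sanitize, name_hash and find_child are names I made up, FNV-1a is just one cheap hash picked as an assumption (any string hash would do), and the purely textual handling of ".." ignores things like symbolic links.

```python
def sanitize(path, cwd="/"):
    """Collapse ".", ".." and repeated slashes into one absolute path."""
    if not path.startswith("/"):
        path = cwd.rstrip("/") + "/" + path   # make relative paths absolute
    parts = []
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if parts:
                parts.pop()                   # ".." at the root stays at the root
        else:
            parts.append(part)
    return "/" + "/".join(parts)


def name_hash(name):
    """32-bit FNV-1a over the UTF-8 bytes of the name."""
    h = 0x811C9DC5
    for byte in name.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def find_child(children, name):
    """children is a list of (hash, name, payload) tuples for one directory."""
    wanted = name_hash(name)
    for h, child_name, payload in children:
        # Cheap integer compare first; the string compare only runs on a
        # hash match, to rule out collisions.
        if h == wanted and child_name == name:
            return payload
    return None
```

For example, sanitize("/a/../b/../c/d/../foo.txt") gives "/c/foo.txt". In a real tree you'd store the hash in the small "hot" entry and keep the name string in the separate "cold" structure, so most mismatches never touch the string at all.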
When you find the correct entry (e.g. "/dir1/dir2/dir3/foo.txt" if an existing file is being opened, or possibly just "/dir1/dir2/dir3/" if the file is being created), create a new file handle. Add the file handle to a list of all file handles that have opened the file and keep track of how the file was opened so that file sharing works. For example, (usually) a file can be opened many times as "read only" but can only be opened once with write access. Some OSs support "append" where you can add data to the end of the file while other people read.
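Here's a small sketch of that bookkeeping, assuming the "many readers, at most one writer" policy described above (a different OS might pick a different policy). OpenFile, Handle, ShareError, open_handle and close_handle are hypothetical names, and "append" mode is left out.

```python
class ShareError(Exception):
    """Raised when a new open would break the sharing rule."""


class OpenFile:
    """Per-entry bookkeeping: every handle that currently has the file open."""

    def __init__(self, path):
        self.path = path
        self.handles = []


class Handle:
    """How one caller opened the file."""

    def __init__(self, open_file, writable):
        self.open_file = open_file
        self.writable = writable
        self.offset = 0


def open_handle(open_file, writable=False):
    # Any number of read-only handles, but only one handle with write access.
    if writable and any(h.writable for h in open_file.handles):
        raise ShareError(open_file.path + " is already open for writing")
    handle = Handle(open_file, writable)
    open_file.handles.append(handle)
    return handle


def close_handle(handle):
    handle.open_file.handles.remove(handle)
```

Opening the same file read-only twice and then once for writing works; a second writable open raises ShareError until the first writer closes its handle.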
Some more notes:
7) All decent OSs do all the above asynchronously - build a structure describing the state of each request and come back to the request later if you can't complete it immediately (e.g. if you need to wait for the file system to do anything).
8) All decent OSs support notifications. Software can ask the VFS to let it know if a file or directory is modified, and the VFS does.
9) Some OSs support I/O priorities. For example, if a low priority thread wants to open "foo.txt" and a high priority thread wants to open "bar.txt" then the VFS doesn't do anything for "foo.txt" if it can do something for "bar.txt". I/O priorities should extend to file systems and device drivers too. For example, if a low priority request to read 50 MB of data from the disk is in progress and a high priority request to write 4 KB of data arrives, then the high priority request should preempt the low priority request. Imagine the high priority request is for swap space and the low priority request is defrag and you'll see why.
10) File systems may not be file systems - consider "/proc" and "/dev".
11) You'll want to keep track of "least recently used" entries so you can free up space when you need to.
12) You'll eventually want to find some way of balancing RAM, VFS cache and swap space - I imagine this is hard to do right. For example, if the VFS cache has data that was last used 10 seconds ago and a thread has data that was last used 8 seconds ago, then it might be better to send the thread's data to swap space (even though it's not "least recently used") if you can read it back from swap faster than you can get the VFS cache data back.
13) You might want to cache file data too. Resist the urge until everything else works perfectly (it's complicated enough already).
14) Don't forget that a lot of things depend on you and your OS!
15) There's lots of stuff you could add - versioning, fault tolerance, sparse files, encryption, compression, search and indexing, meta-data, snapshots/rollbacks, etc.
16) Don't forget I made this all up while I typed...
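As a rough illustration of the asynchronous requests and I/O priorities mentioned in the notes, here's a sketch where each request is a small state structure and the VFS always picks the most urgent pending one. Request, RequestQueue and the "smaller number = more urgent" convention are assumptions made for the example, not how any particular OS does it.

```python
import heapq
import itertools


class Request:
    """State of one in-flight VFS request."""

    def __init__(self, priority, path):
        self.priority = priority   # assumption: smaller number = more urgent
        self.path = path
        self.state = "pending"


class RequestQueue:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # tie-breaker: FIFO within a priority

    def submit(self, request):
        request.state = "pending"
        heapq.heappush(self._heap, (request.priority, next(self._order), request))

    def next_request(self):
        """Pop the most urgent pending request, or None if there is none."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def park(self, request):
        """The request can't finish yet (e.g. waiting on the file system)."""
        request.state = "waiting"

    def resume(self, request):
        """Whatever the request was waiting for has arrived; queue it again."""
        self.submit(request)
```

If a low priority open of "foo.txt" is submitted before a high priority open of "bar.txt", next_request() still returns the "bar.txt" request first. Preempting a request that a device driver has already started is a separate problem that this doesn't cover.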
Cheers,
Brendan