OSDev.org

Posted: **Thu Mar 01, 2012 10:34 am**

Hi again

I have just read a tutorial about virtual file systems and how they abstract the actual file system. What I understand so far is that having callbacks to the actual file system functions in the file node structure seems like a good idea.

So there are two abstraction layers, the VFS and the actual file system. When you call VFS_read(file_node * n, .....) it calls the callback for the read function saved in the file_node structure.

But a single file system is usually usable on many different devices, like the same file system on a USB-stick and on a floppy, right?

So then I would need a third layer of abstraction. Something which reads raw blocks of data which the second layer of abstraction (file system specific) then parses. The tutorial I read didn't indicate that this third layer was needed, so that's why I'm asking.

How can I implement this in a nice way? Should every file node also have a callback to functions for reading raw data? How did you do it?

Thanks in advance!

Posted: **Thu Mar 01, 2012 10:43 am**

For my OS there are four entities:
- disk driver (e.g. hard disk, CD-ROM, RAM disk, etc.)
- fs driver (FAT, EXT2, etc.)
- the disk (in Unix terms, sda, sdb), which has a disk driver associated with it
- the volumes (sda1, ..., or the drive letters), which have a disk and an fs driver associated with them

Then the VFS maintains a mount table recording which path corresponds to each volume.
Say /vol/initrd maps to a RAM disk device's first volume.
When you access /vol/initrd/abc, the VFS passes /abc to that volume.
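
A minimal sketch of such a mount table in C, assuming a fixed-size array and longest-prefix matching. The names `mount_add` and `vfs_resolve` and the `struct volume` layout are hypothetical, not taken from the post, and the root mount is registered with an empty prefix so that every prefix ends at a path-component boundary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical volume object; a real one would hold disk and fs driver. */
struct volume { const char *name; };

struct mount { const char *prefix; struct volume *vol; };

#define MAX_MOUNTS 8
static struct mount mounts[MAX_MOUNTS];
static size_t mount_count;

/* Register a volume under a path prefix without trailing '/'.
 * The root mount uses the empty prefix "". */
int mount_add(const char *prefix, struct volume *vol)
{
    if (mount_count == MAX_MOUNTS)
        return -1;
    mounts[mount_count].prefix = prefix;
    mounts[mount_count].vol = vol;
    mount_count++;
    return 0;
}

/* Return the volume whose prefix is the longest match for path and
 * store the volume-relative remainder in *rest. NULL if nothing matches. */
struct volume *vfs_resolve(const char *path, const char **rest)
{
    struct volume *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < mount_count; i++) {
        size_t len = strlen(mounts[i].prefix);
        if (strncmp(path, mounts[i].prefix, len) != 0)
            continue;
        /* The prefix must end on a component boundary. */
        if (path[len] != '/' && path[len] != '\0')
            continue;
        if (best == NULL || len > best_len) {
            best = mounts[i].vol;
            best_len = len;
            *rest = path[len] ? path + len : "/";
        }
    }
    return best;
}
```

With the root registered as "" and a RAM disk volume as "/vol/initrd", resolving "/vol/initrd/abc" yields the RAM disk volume and the remainder "/abc". A linear scan is enough for a handful of mounts; a hash map or trie is an alternative once the table grows.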

Posted: **Thu Mar 01, 2012 11:18 am**

I have a similar setup. At the lowest level is the device driver, which reads ranges of blocks from a device. Filesystem drivers sit atop these, requesting raw blocks from, and writing them to, those devices. On top of that is a VFS where each node of the tree is cached in RAM the first time it is requested, and each node contains a pointer to the responsible driver. The responsible fs driver maintains a list of function pointers that includes getdir (read dir entries), readfile (read blocks from a file), writefile (write/append blocks to a file) and delfile. Note that the VFS still operates on fixed-size blocks; stream-based I/O is a further abstraction on top of the VFS, inside the libc functions read() and write().
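
The three layers could be sketched roughly as follows. Only `readfile` is taken from the post (getdir, writefile and delfile would be further table entries); the struct layouts, the RAM disk and the "toy" filesystem with contiguous files are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Lowest layer: a device that reads ranges of fixed-size blocks. */
struct block_dev {
    int (*read_blocks)(struct block_dev *dev, unsigned first,
                       unsigned count, void *buf);
    void *priv;             /* driver-private state */
    unsigned block_count;
};

struct vfs_node;

/* Middle layer: one table of function pointers per filesystem driver. */
struct fs_driver {
    int (*readfile)(struct vfs_node *node, unsigned first,
                    unsigned count, void *buf);
};

/* Top layer: a cached VFS node that points at the responsible driver. */
struct vfs_node {
    const struct fs_driver *drv;
    struct block_dev *dev;
    unsigned start_block;   /* where the file's data begins on the device */
    unsigned block_count;
};

/* A RAM disk standing in for real hardware. */
int ram_read_blocks(struct block_dev *dev, unsigned first,
                    unsigned count, void *buf)
{
    if (first > dev->block_count || count > dev->block_count - first)
        return -1;
    memcpy(buf, (unsigned char *)dev->priv + (size_t)first * BLOCK_SIZE,
           (size_t)count * BLOCK_SIZE);
    return 0;
}

/* Toy filesystem: every file is one contiguous run of blocks. */
int toy_readfile(struct vfs_node *node, unsigned first,
                 unsigned count, void *buf)
{
    if (first > node->block_count || count > node->block_count - first)
        return -1;
    return node->dev->read_blocks(node->dev, node->start_block + first,
                                  count, buf);
}

const struct fs_driver toy_fs = { toy_readfile };

/* The VFS entry point only dispatches; it knows nothing about layout. */
int vfs_readfile(struct vfs_node *node, unsigned first,
                 unsigned count, void *buf)
{
    return node->drv->readfile(node, first, count, buf);
}
```

Because the filesystem driver only ever calls `read_blocks` through the `block_dev` pointer, the same driver works unchanged on a floppy, a USB stick or a RAM disk.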

Posted: **Thu Mar 01, 2012 11:47 am**

My VFS works like this:

* Structure of all device drivers.
* Structure of all "mounted"/found devices with ID and pointer to device-driver + FS-ID.
* Structure of all filesystem drivers with ID.

When a device driver is added in struct 1, I run an init() function, and if any supported devices are found they are added to struct 2. To identify the FS, I loop through the FS struct (3) and run an init() function that adds the FS-ID to struct 2 if the FS was found on the device, or returns false so that the next filesystem driver is tried.

When reading a file with the VFS, it will use the filesystem driver with device ID as argument - so inside the filesystem driver the VFS can again be called to do device specific reading based on that ID. The device ID will be parsed by the VFS from the file path provided.
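
The identification loop described above might look something like this. It is only a sketch: the struct fields, the `probe` callback and the two toy filesystems recognised by a made-up magic byte are all assumptions, and a real probe would read and check an on-disk superblock or boot sector through the device driver.

```c
#include <stddef.h>

#define FS_NONE (-1)

/* Entry in the table of found devices ("struct 2" in the post). */
struct device {
    int id;
    int fs_id;                    /* FS_NONE until a driver claims it */
    const unsigned char *sector0; /* stand-in for reading the first sector */
};

/* Entry in the table of filesystem drivers ("struct 3"). */
struct fs_driver {
    int id;
    int (*probe)(const struct device *dev); /* nonzero if the fs is present */
};

/* Two toy filesystems, identified by an invented magic byte. */
int fsa_probe(const struct device *dev) { return dev->sector0[0] == 'A'; }
int fsb_probe(const struct device *dev) { return dev->sector0[0] == 'B'; }

/* Try each filesystem driver in turn and record the first one that
 * recognises the device. Returns the fs id, or FS_NONE. */
int fs_identify(struct device *dev, const struct fs_driver *drivers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (drivers[i].probe(dev)) {
            dev->fs_id = drivers[i].id;
            return dev->fs_id;
        }
    }
    dev->fs_id = FS_NONE;
    return FS_NONE;
}
```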

Posted: **Thu Mar 01, 2012 12:02 pm**

Thanks for your replies, it's really fantastic how quickly you can get help on this forum.
I will do some coding now, thanks again.

Posted: **Thu Mar 01, 2012 2:24 pm**

Bubach, what part of your layout handles partitions? Is this part of each device driver, appearing as multiple virtual block devices, and how do you prevent duplicated code there? Personally I haven't properly dealt with this yet but I intend to make a separate module to handle it that sits between device and fs.

Posted: **Thu Mar 01, 2012 2:56 pm**

Mine's similar. When each driver starts it registers each disk that it finds with the FS. The FS creates a volume_t for each disk and populates the /dev directory with each one. (/dev/sda, /dev/sdb etc.)

When volumes are partitioned, a volume_t is also created for each partition. These also appear in /dev (/dev/sda1, /dev/sda2 etc.). Each volume has a cache.

Volumes can be mounted. This creates a filesystem_t. Could be EXT3, FAT etc.

Then there's inode_t for each file on the disk and fcb_t for each open file.

There is a little issue with using the same volume_t structure for whole disks and partitions in that they overlap. For example /dev/sda1 is enclosed within /dev/sda. If someone does a read() of /dev/sda the FS converts it to a read of /dev/sda1. This keeps the caches in order.
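
One way to picture that redirection, as a sketch only: the fields of `volume_t` and the function `volume_redirect` below are guesses for illustration, not the poster's actual code. A sector addressed on the whole disk is mapped to the partition that contains it, so each sector is only ever cached by one volume.

```c
#include <stddef.h>

typedef struct volume {
    const char *name;
    struct volume *parent;   /* NULL for a whole disk */
    unsigned long start;     /* first sector, relative to the parent */
    unsigned long sectors;   /* length in sectors */
    struct volume *parts;    /* partitions, if this is a whole disk */
    size_t part_count;
} volume_t;

/* Map a sector on a whole disk to the partition that contains it.
 * Sectors outside every partition stay with the disk volume itself. */
volume_t *volume_redirect(volume_t *disk, unsigned long sector,
                          unsigned long *out_sector)
{
    for (size_t i = 0; i < disk->part_count; i++) {
        volume_t *p = &disk->parts[i];
        if (sector >= p->start && sector - p->start < p->sectors) {
            *out_sector = sector - p->start;
            return p;
        }
    }
    *out_sector = sector;
    return disk;
}
```

The partition table itself and any gaps between partitions are not covered by a partition, so in this sketch they remain cached by the whole-disk volume.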

Posted: **Thu Mar 01, 2012 3:34 pm**

brain wrote:Bubach, what part of your layout handles partitions? Is this part of each device driver, appearing as multiple virtual block devices, and how do you prevent duplicated code there? Personally I haven't properly dealt with this yet but I intend to make a separate module to handle it that sits between device and fs.

Yes, for partitions it's up to the device driver to handle that. Not sure what you mean by code duplication? The FS driver will not have to provide anything but a (virtual) drive ID, and the device driver will have to keep track of the fact that it's actually a partition.

At least that's how I picture it at the moment, but we'll see once I get to harddrive support if I've overlooked anything.

Posted: **Thu Mar 01, 2012 3:48 pm**

I use a slightly different approach. I have 3 layers: block device (reading/writing sectors), fs layer (fat, iso9660 etc.; readdir, fstat etc.) and vfs (ram fs, mounts). In the vfs I allow two different entries with the same name in the same directory, but one of them must end in a separator. This saves me loop devices and gives me the ability to handle partitions as directories. It's easier to understand with an example:
/dev/ide0 - block device for ide primary master
/dev/ide0/ - mounted disk, fs driver is "gpt fs"
/dev/ide0/EFI System Partition - block device for the first partition
/dev/ide0/EFI System Partition/ - mounted disk, fs driver is "vfat"
/dev/ide0/EFI System Partition/cdimages/something.iso - a file
/dev/ide0/EFI System Partition/cdimages/something.iso/ - loopback mounted iso9660 fs
This can result in very long paths, so I make use of a massive number of generated symlinks, for example:
/dev/disk0 -> /dev/ram0
/dev/disk1 -> /dev/ide0/EFI System Partition
/dev/disk2 -> /dev/ide0/My Data
/dev/disk3 -> /dev/ide0/My Data/somedir/floppy.img
etc., each of which in turn has the same directory separator suffix hack:
/dev/disk0 - block device
/dev/disk0/ - mounted fs

Advantage: very clear, self-explanatory directory structure
Disadvantage: long paths, and I'm not happy with the current implementation: symlinks eat up way too much inode space. I'm not quite sure I won't drop the whole concept on a bad day because of this. I'm also thinking of dividing /dev into two separate mount points (/dev and /devices) like Solaris, or of using in-memory-only aliases.

Posted: **Sat Mar 03, 2012 9:18 am**

tobbebia wrote: I have just read a tutorial about virtual file systems and how it abstracts the actual file system. What I understand so far is that having callbacks to the actual file system functions in the file node structure seems like a good idea.

A tip I picked up while working on the vfs: having a structure for each file/node in a filesystem complicates things considerably, even though this seems to be the traditional way to do it. What I mean is that instead of having something like this to represent a node in the filesystem:

Code: Select all

struct node {

  int inode;
  char *name;
  unsigned int (*open)(struct node *node, ...);
  unsigned int (*close)(struct node *node, ...);
  unsigned int (*read)(struct node *node, ...);
  unsigned int (*write)(struct node *node, ...);

};

You can get rid of that entirely and instead let the filesystem itself hold the open/close/read/write functionality, which takes only a fid as an argument. That fid does not have to represent a node structure; it is just a unique identifier within that filesystem. This made it very easy to implement something like a synthetic filesystem (like /proc or /sys) with minimal fuss.
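
A rough sketch of what a fid-based interface could look like for a tiny synthetic filesystem. The `struct filesystem` layout, the `synth_read` function and the two fake files are invented for this example; only the idea of operations that take a fid instead of a node comes from the post.

```c
#include <stddef.h>
#include <string.h>

/* The operations live in the filesystem object, not in each node. */
struct filesystem {
    unsigned int (*read)(struct filesystem *fs, unsigned int fid,
                         unsigned int offset, unsigned int count, void *buf);
    /* open, close and write would follow the same pattern. */
};

/* Synthetic contents indexed by fid; fid 0 is reserved as "invalid". */
static const char *const synth_files[] = { NULL, "0.1\n", "42\n" };

/* Copy up to count bytes of file fid, starting at offset, into buf.
 * Returns the number of bytes copied (0 for a bad fid or at end of file). */
unsigned int synth_read(struct filesystem *fs, unsigned int fid,
                        unsigned int offset, unsigned int count, void *buf)
{
    (void)fs;
    if (fid == 0 || fid >= sizeof synth_files / sizeof synth_files[0])
        return 0;
    size_t len = strlen(synth_files[fid]);
    if (offset >= len)
        return 0;
    if (count > len - offset)
        count = (unsigned int)(len - offset);
    memcpy(buf, synth_files[fid] + offset, count);
    return count;
}

struct filesystem synthfs = { synth_read };
```

No per-file structure is allocated anywhere: the fid is simply an index that the filesystem interprets however it likes, which is what makes generated files cheap to add.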

Posted: **Sat Mar 03, 2012 9:23 am**

IMO the traditional way is indeed to put the functions in the fs entity. The file node references the fs by ID or pointer.
I hope nobody on earth would put the functions within the file node; it wastes memory when you have thousands of open files.

Posted: **Sat Mar 03, 2012 9:40 am**

I also thought it was very weird. Even James Molloy has that in his tutorial.

http://www.jamesmolloy.co.uk/tutorial_h ... nitrd.html

Posted: **Sat Mar 03, 2012 9:52 am**

Tutorials simplify things and usually don't care about performance or about demonstrating elegant designs. You read the tutorial and learnt how it works, and now you can think up your own (maybe better?) way to do it. In that sense the tutorial did a great job.

Posted: **Sat Mar 03, 2012 10:27 am**

I'm still stuck on this.

Before I start I want to be sure that my design allows for multiple devices and file systems. But I'm still not sure how I can get this to fit together.

Right now I have two structures, fs_dev and fs_node. Every fs_node has an integer which contains the device ID. The device structure contains pointers to the necessary file system functions and device driver functions.

To make it more clear:

Code: Select all

// Forward declarations so the two structures can refer to each other
typedef struct fs_dev fs_dev_t;
typedef struct fs_node fs_node_t;

struct fs_dev
{
	unsigned int id;
	
	// File system functions
	unsigned int (* read)(fs_node_t * node, fs_dev_t * dev, unsigned int offset, unsigned int size, char * buffer);
	unsigned int (* write)(fs_node_t * node, fs_dev_t * dev, unsigned int offset, unsigned int size, char * buffer);
	fs_node_t * (* readdir)(fs_node_t * node, fs_dev_t * dev, unsigned int index);
	fs_node_t * (* finddir)(fs_node_t * node, fs_dev_t * dev, char * name);

	// Device driver functions
	unsigned int (* dev_read)(fs_dev_t * dev, unsigned int offset, unsigned int size, char * buffer);
	unsigned int (* dev_write)(fs_dev_t * dev, unsigned int offset, unsigned int size, char * buffer);

};

struct fs_node
{
	char name[FS_NODE_NAME_LENGTH];

	// File attributes
	unsigned int type;
	unsigned int permissions;
	unsigned int size;
	
	// Which device the node belongs to
	unsigned int dev_id;

	// Used if this node is a mount/symlink
	fs_node_t * ptr;
};

So the plan was to keep all the devices in a linked list. When one of the abstract functions such as vfs_write is called, it first finds the device with the correct ID (obtained from the fs_node passed to the function) and then passes the fs_dev structure to the file system functions. The file system functions then use the device driver functions found in the fs_dev structure to read and write raw data on the device.

Let's say that I have a HDD with file system FAT connected to the computer. I want to read from a file on the HDD. I would like every device to be mounted in a virtual file system at /dev. How do I create this virtual file system? Is it with an initrd file?
So I create an fs_node with the name /dev/hdd0. Should I write this file to the virtual file system so that it can later be found when listing the contents of /dev? In James Molloy's tutorial he says something like "who would want to write files to an initrd fs", so he doesn't include the ability to write files in his initrd file system format.
I then call vfs_finddir(root_node, "/dev/hdd0/myFile.txt") to get a node structure for myFile.txt. I then pass that structure to vfs_read; vfs_read calls fat_read, and fat_read uses the device driver functions (hdd_read) to search through the whole HDD until it finds the offset of the file I wanted to read from (this part I'm unsure about; IMO searching through to find it would take a long, long time, but how can I cache it when multiple devices might use the same function because they have the same file system?).

Questions extracted from the text above:

How should the root file system be created, is this what initrd is for?
When I mount something, should I create a new node in the root file system? Then my initrd format will have the capability to write files, why doesn't James Molloy have this in his?
Should I really parse the whole device (e.g. HDD) to find a file every time I want to read or write from it? How can I keep a cache for multiple devices that use the same fs?

Is this a good approach at handling devices, file systems and files?

Thanks for your expertise.

Posted: **Sat Mar 03, 2012 2:01 pm**

Don't mix volumes with the mount table; they can be kept separate.

Think about it this way: you have a list of volume objects, each of which can do this:
FILE_NODE file;
volume->open(&file, "/abc.txt");

Now, the mount mechanism is merely a hash mapping.

Code: Select all

vfs_mount_add("/", vol[0]);
vfs_mount_add("/foo/", vol[1]);

When opening a file:

Code: Select all

vfs_open("/foo/abc.txt");

This resolves to vol[1] and passes "/abc.txt" to it.

Should I really parse the whole device (e.g. HDD) to find a file every time I want to read or write from it?

To speed things up you may store the volume reference, as well as anything else that was calculated at open time, in the FILE_NODE, DIR_NODE, etc.
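
As an illustration of that idea, here is a small sketch in which the directory search happens once, at open time, and later reads reuse the location stored in the FILE_NODE. Everything here is an assumption made for the example: the `struct volume` fields, the hard-coded lookup of "/abc.txt", the `lookups` counter, and the use of a free function `volume_open` rather than the `volume->open` method shown above.

```c
#include <stddef.h>
#include <string.h>

struct volume {
    const char *data;     /* toy "disk" contents */
    unsigned lookups;     /* number of directory searches performed */
};

typedef struct {
    struct volume *vol;   /* resolved once, at open */
    size_t start;         /* file location, computed once at open */
    size_t size;
} FILE_NODE;

/* Toy lookup: the only file is "/abc.txt", 5 bytes stored at offset 4.
 * A real driver would walk its on-disk directory structures here. */
int volume_open(struct volume *vol, FILE_NODE *file, const char *path)
{
    vol->lookups++;
    if (strcmp(path, "/abc.txt") != 0)
        return -1;
    file->vol = vol;
    file->start = 4;
    file->size = 5;
    return 0;
}

/* Reads use the cached location and never search the volume again. */
size_t file_read(const FILE_NODE *file, size_t offset, size_t count, char *buf)
{
    if (offset >= file->size)
        return 0;
    if (count > file->size - offset)
        count = file->size - offset;
    memcpy(buf, file->vol->data + file->start + offset, count);
    return count;
}
```

Because the cached state lives in the FILE_NODE and not in the filesystem driver, two devices formatted with the same filesystem can share the driver code without sharing any per-file data.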

Questions about file handling