Managing devices and filesystems

Agola · Post by **Agola** » Fri Jun 16, 2017 5:06 am

Hi,

I'm using a device_t structure that has read and write functions; each takes a size and an offset in bytes and a pointer to a buffer.
Each filesystem_t object registered in the VFS has a device_t pointer. The filesystem calls device->read or device->write when it needs to read from or write to the device.

But things started to get confusing once I had multiple partitions on the same device and multiple devices sharing the same read function (ata_read and ata_write take the device to access as a uint8_t).

How should I manage devices, partitions, volumes and their filesystems? I'm out of ideas.

Thanks in advance.

mallard · Post by **mallard** » Fri Jun 16, 2017 6:23 am

Have "device_t" include a "void*" which points to device-specific data that allows read/write/etc. to select the correct device, then pass this as an extra parameter to the functions.

Make partitions separate virtual devices. The extra "void*" on them would point to a structure that would contain a pointer to the "device_t" for the underlying hardware, as well as the partition details.
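
A minimal sketch of that layout, assuming byte-addressed callbacks like the ones Agola describes. The `ramdisk_t` backing store, the `in_range` helper and all field names are made up for illustration; an ATA driver would keep its drive number in its own private structure instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic device: every driver exposes the same function signatures. */
typedef struct device device_t;
struct device {
    int (*read)(device_t *dev, uint64_t offset, size_t size, void *buf);
    int (*write)(device_t *dev, uint64_t offset, size_t size, const void *buf);
    void *priv;   /* device-specific data, opaque to the caller */
};

/* Hypothetical stand-in for real hardware: a RAM-backed "disk".
   An ATA driver would store e.g. its uint8_t drive number here instead. */
typedef struct {
    uint8_t *base;
    uint64_t size;
} ramdisk_t;

/* Nonzero when [offset, offset + size) fits inside limit bytes. */
static int in_range(uint64_t offset, size_t size, uint64_t limit)
{
    return offset <= limit && size <= limit - offset;
}

static int ramdisk_read(device_t *dev, uint64_t offset, size_t size, void *buf)
{
    ramdisk_t *rd = dev->priv;
    if (!in_range(offset, size, rd->size))
        return -1;
    memcpy(buf, rd->base + offset, size);
    return 0;
}

static int ramdisk_write(device_t *dev, uint64_t offset, size_t size, const void *buf)
{
    ramdisk_t *rd = dev->priv;
    if (!in_range(offset, size, rd->size))
        return -1;
    memcpy(rd->base + offset, buf, size);
    return 0;
}

/* A partition is a virtual device; its private data points at the parent. */
typedef struct {
    device_t *parent;
    uint64_t start;    /* byte offset of the partition on the parent */
    uint64_t length;   /* partition size in bytes */
} partition_t;

static int partition_read(device_t *dev, uint64_t offset, size_t size, void *buf)
{
    partition_t *p = dev->priv;
    if (!in_range(offset, size, p->length))
        return -1;
    return p->parent->read(p->parent, p->start + offset, size, buf);
}

static int partition_write(device_t *dev, uint64_t offset, size_t size, const void *buf)
{
    partition_t *p = dev->priv;
    if (!in_range(offset, size, p->length))
        return -1;
    return p->parent->write(p->parent, p->start + offset, size, buf);
}
```

A filesystem holding a `device_t *` for the partition never sees the parent or the offset arithmetic; it just calls `dev->read(dev, ...)`.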

SpyderTL · Post by **SpyderTL** » Fri Jun 16, 2017 6:28 am

I use nested objects to handle this type of situation.

My file read() function calls my file system read() function, which calls my partition read() function (if I had one...), which would call my driver read() function.

If you are clever, and all of your read functions take identical parameters, then you can swap out any of these components with other layers, like different drivers, or a RAM disk, or a network storage device, or even a microphone.

Reading bytes as a stream (or filling a data buffer with bytes) is pretty universal, as long as you move everything else that is device specific out of your read function.

Linux does this by using one read function to read bytes, and a separate generic ioctl function for everything else.
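
Roughly, the split could look like the sketch below. This is not the Linux interface: `layer_t`, the request codes and the in-memory source are all invented for the example. The point is only that read() stays identical across layers while anything device-specific goes through ioctl().

```c
#include <stddef.h>
#include <string.h>

/* Every layer offers the same two entry points. */
typedef struct layer layer_t;
struct layer {
    long (*read)(layer_t *self, void *buf, size_t size);  /* bytes read */
    int (*ioctl)(layer_t *self, unsigned request, void *arg);
    layer_t *lower;   /* next layer down, or NULL at the bottom */
    void *state;      /* layer-specific data */
};

/* Hypothetical request codes for the device-specific extras. */
enum { REQ_GET_SIZE = 1, REQ_REWIND = 2 };

/* Bottom layer for the example: a byte stream held in memory. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} mem_state_t;

static long mem_read(layer_t *self, void *buf, size_t size)
{
    mem_state_t *m = self->state;
    size_t left = m->size - m->pos;
    if (size > left)
        size = left;              /* short read at end of stream */
    memcpy(buf, m->data + m->pos, size);
    m->pos += size;
    return (long)size;
}

static int mem_ioctl(layer_t *self, unsigned request, void *arg)
{
    mem_state_t *m = self->state;
    switch (request) {
    case REQ_GET_SIZE:
        *(size_t *)arg = m->size;
        return 0;
    case REQ_REWIND:
        m->pos = 0;
        return 0;
    default:
        return -1;                /* unknown request */
    }
}

/* A pass-through layer: forwards both calls to whatever sits below it. */
static long pass_read(layer_t *self, void *buf, size_t size)
{
    return self->lower->read(self->lower, buf, size);
}

static int pass_ioctl(layer_t *self, unsigned request, void *arg)
{
    return self->lower->ioctl(self->lower, request, arg);
}
```

Because the upper layer only knows about `lower->read` and `lower->ioctl`, the memory source can be replaced by a disk driver or a network source without touching it.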

simeonz · Post by **simeonz** » Fri Jun 16, 2017 7:02 am

The addressing scheme and the hardware are concerns for the driver. For performance reasons, the connection between a file system object and a device object, or the hierarchy between device objects may be exposed in order to regulate the I/O queue depth accumulating for devices during system activities such as flushing. Also, some information about the underlying technology - like reliability, availability, access granularity, endurance, etc., may be profitably exposed.

Beyond that, the methods could be passed a pointer to some canonical structure for the driver class, and that structure can be embedded in a driver-specific or device-specific outer structure, which lets the driver store additional data (such as network location, physical interface, or layout information). Alternatively, the canonical structure may have a field holding a pointer to private data. Either way works; the latter approach is more flexible at the cost of one additional allocation.
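
To illustrate the embedding variant, here is a small sketch. The names `blockdev_t` and `ata_device_t` and the `container_of` macro are assumptions for the example, and the read function is a stub that records the request rather than touching hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Canonical structure for the driver class: all that callers ever see. */
typedef struct blockdev blockdev_t;
struct blockdev {
    int (*read_sector)(blockdev_t *dev, uint64_t lba, void *buf);
    uint32_t sector_size;
};

/* Recover the outer structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical ATA device: the canonical part plus driver-private data. */
typedef struct {
    blockdev_t base;     /* embedded canonical structure */
    uint8_t drive;       /* which drive on the controller */
    uint64_t last_lba;   /* kept only so the stub has something to record */
} ata_device_t;

static int ata_read_sector(blockdev_t *dev, uint64_t lba, void *buf)
{
    ata_device_t *ata = container_of(dev, ata_device_t, base);
    /* A real driver would program the controller for ata->drive here.
       This stub only records the request so the sketch can run. */
    ata->last_lba = lba;
    ((uint8_t *)buf)[0] = ata->drive;
    return 0;
}
```

Callers pass around `blockdev_t *` only; the drive number travels with the object instead of being an extra argument, and no second allocation is needed.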

It was already noted that filesystems may be network-attached or pseudo filesystems, and thus have no OS-controlled backing device. The device may be RAM. It may be a volume produced by striping, spanning, allocating from a storage pool, etc. It may be iSCSI. So the description of a device or file system cannot be standardized effectively, hence the opaqueness.

Agola · Post by **Agola** » Fri Jun 16, 2017 12:43 pm

SpyderTL wrote:I use nested objects to handle this type of situation.

My file read() function calls my file system read() function, which calls my partition read() function (if I had one...), which would call my driver read() function.

If you are clever, and all of your read functions take identical parameters, then you can swap out any of these components with other layers, like different drivers, or a RAM disk, or a network storage device, or even a microphone.

Reading bytes as a stream (or filling a data buffer with bytes) is pretty universal, as long as you move everything else that is device specific out of your read function.

Linux does this by using one read function to read bytes, and a separate generic ioctl function for everything else.

That makes sense. But what happens if my driver takes a device number as a parameter, like my ATA driver does?
For example, file_read() calls fat32_read(), fat32_read_cluster() calls partition_read(), and partition_read() calls ata_read(), which also needs the device number.

How can I get rid of these device specific things?

Thanks in advance.

Brendan · Post by **Brendan** » Fri Jun 16, 2017 7:03 pm

Hi,

Agola wrote: I'm using a device_t structure that has read and write functions; each takes a size and an offset in bytes and a pointer to a buffer.
Each filesystem_t object registered in the VFS has a device_t pointer. The filesystem calls device->read or device->write when it needs to read from or write to the device.

But things started to get confusing once I had multiple partitions on the same device and multiple devices sharing the same read function (ata_read and ata_write take the device to access as a uint8_t).

How should I manage devices, partitions, volumes and their filesystems? I'm out of ideas.

The first thing I'd do is separate "devices" from "file systems". They are completely different things that shouldn't be conflated.

Typically, storage device drivers would create some sort of "pseudo-files" (e.g. "/dev/sda", "/dev/sda1", "/dev/sda2"), and file systems would mount a file, possibly without caring whether it is a device driver's "pseudo-file" or a normal file provided by a different file system (e.g. there is no reason why your ISO9660 file system code can't mount "/usr/home/Agola/my_CD_image.iso").

Note that this also means some kinds of files (e.g. "*.zip" and "*.tar") can be supported as file systems, which could be very convenient if combined with an "auto-mount in place" feature (e.g. a user could download "foo.zip" and then do "ls foo.zip" and "cat foo.zip/bar.txt" without any hassles).

Cheers,

Brendan

simeonz · Post by **simeonz** » Sat Jun 17, 2017 1:01 am

Agola wrote: For example, file_read() calls fat32_read(), fat32_read_cluster() calls partition_read(), and partition_read() calls ata_read(), which also needs the device number.

I propose that you don't work with device numbers at all. Those are intended for userland and can be supported as such by lookup tables. Each kernel object in the storage stack should directly reference the objects underneath it (the device object for the enclosing storage space) and store any additional parameters that describe its layout.

If the underlying storage happens to be a partition, it will behave just like an unpartitioned device as far as the filesystem is concerned. The partition object will keep a field referencing the enclosing disk device, along with its first and last sectors. But the partition device and the disk device are essentially the same abstraction: contiguous sector storage. The device references and other parameters are established when the relevant kernel objects are created: when the file system is mounted (automatically or manually), when the partitions are enumerated, when the storage interface enumerates the attached devices, etc. If some storage device happens to be a partition, fine, it will reference the enclosing storage. If the storage device is a RAM disk, it will keep a pointer to a memory allocation. For a file-backed storage device (loop device), a reference to a file kernel object will be kept.

Similarly, the file system gets a description of some kind of storage when it mounts, whose nature depends on your OS design and the file system (it could be a network path, a local device file, or even a storage interface path). Then, when mounting, this device description is converted to the appropriate kernel object pointer: a storage device object for a local file system, a TCP connection object (like a socket) for a network file system, a pointer to a buffer in RAM for a tmpfs-like file system, etc.

Where are those references and parameters stored? That depends on your OS design choices: either a private (i.e. driver-implementation-specific) structure or a public (i.e. driver-class-standardized) structure. For things like TCP connections or RAM buffer pointers, a private field would be the norm, as those are rather specific and generally of no interest to any system facility other than the driver itself. Note that you don't need to pass two pointers around: either structure embedding or a pointer to a private structure in the public structure will do the trick.

Also, you should use reference counting to guarantee proper lifetime management of your kernel objects. For example, even if the disk is disconnected, the kernel object must remain alive to provide the file system and partition drivers with an interface that denies their requests and makes sure that they are informed of the disk's new status.
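
As a rough illustration of the lifetime point, the sketch below keeps a disk object alive after disconnection so that remaining holders get an error instead of a dangling pointer. Everything in it is hypothetical (`disk_t`, the function names, the `-1` error code), and the counter is not atomic; a real kernel would need atomic operations or a lock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct disk {
    unsigned refcount;   /* NOT atomic: single-threaded sketch only */
    bool online;         /* cleared when the hardware goes away */
} disk_t;

/* Create a disk object holding one reference for the creator. */
static disk_t *disk_create(void)
{
    disk_t *d = calloc(1, sizeof *d);
    if (d != NULL) {
        d->refcount = 1;
        d->online = true;
    }
    return d;
}

/* Take an additional reference, e.g. when a partition object is created. */
static disk_t *disk_get(disk_t *d)
{
    d->refcount++;
    return d;
}

/* Drop a reference; returns true if this call freed the object. */
static bool disk_put(disk_t *d)
{
    if (--d->refcount == 0) {
        free(d);
        return true;
    }
    return false;
}

/* Called by the bus driver when the device disappears. */
static void disk_disconnect(disk_t *d)
{
    d->online = false;
}

/* Stubbed read: denies requests once the disk is gone. */
static int disk_read(disk_t *d, uint64_t lba, void *buf)
{
    (void)lba;
    (void)buf;
    if (!d->online)
        return -1;       /* hypothetical "no such device" error */
    return 0;            /* a real driver would transfer data here */
}
```

The bus driver can drop its own reference right after `disk_disconnect()`; the object is freed only when the last partition or file system holding it calls `disk_put()`.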

irvanherz · Post by **irvanherz** » Sat Jun 17, 2017 10:12 pm

Agola wrote: I'm using a device_t structure that has read and write functions; each takes a size and an offset in bytes and a pointer to a buffer.
Each filesystem_t object registered in the VFS has a device_t pointer. The filesystem calls device->read or device->write when it needs to read from or write to the device.

But things started to get confusing once I had multiple partitions on the same device and multiple devices sharing the same read function (ata_read and ata_write take the device to access as a uint8_t).

How should I manage devices, partitions, volumes and their filesystems? I'm out of ideas.

Thanks in advance.

In UNIX, everything is a file. So, every file must have an inode / vnode (http://www.cs.fsu.edu/~awang/courses/co ... /vnode.pdf).

I think it would be better to treat disks and partitions as files too. Every device driver that provides a read/write service would then create an inode object.

For example, whenever a read() is issued on a file, the filesystem driver processes it and passes the arguments to the read() of the partition's inode. The partition's inode processes the arguments and passes them to the read() of the disk's inode. To make this easy, each inode has a field that points to the lower-level inode it sits on.

To clarify:
1. An IDE driver scans the hardware, then creates an inode for /dev/storage/hdX.
2. A partition driver scans the inodes in /dev/storage/, then creates partition inodes as /dev/partitions/hdX-n.
3. A filesystem driver mounts the /dev/partitions/hdX-n partition, then creates its root inode.

In summary:
- The IDE driver creates an inode for /dev/storage/hdX. Lower device: null
- The partition driver creates /dev/partitions/hdX-n. All of these inodes have the lower device /dev/storage/hdX
- The filesystem driver mounts /dev/partitions/hdA-0 at /mnt/system, so the inode of /mnt/system points to its lower device: /dev/partitions/hdA-0
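
The chain in the summary can be sketched as a linked list of inodes. This is a simplified illustration, not a real inode layout: the `start` field and `inode_resolve` are invented here, and a real filesystem would map file offsets to blocks rather than add a constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Stripped-down inode: just a name and the link to the device below it. */
typedef struct inode {
    const char *path;
    struct inode *lower;   /* lower-level inode, NULL for real hardware */
    uint64_t start;        /* byte offset of this object within lower */
} inode_t;

/* Follow the lower links down to the hardware inode, translating the
   offset at each step. Returns the bottom-most inode. */
static inode_t *inode_resolve(inode_t *node, uint64_t *offset)
{
    while (node->lower != NULL) {
        *offset += node->start;
        node = node->lower;
    }
    return node;
}
```

With a disk inode for /dev/storage/hdA, a partition inode for /dev/partitions/hdA-0 starting 1 MiB into the disk, and a root inode for /mnt/system on top, resolving an offset on the root walks two links and ends at the disk inode.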
