
FS Drivers: How to integrate stock filesystems in exokernels

Posted: Mon Sep 13, 2010 3:39 am
by Combuster
I'm stuck with a design problem: In traditional exokernel design, it is customary that an application has access to the locations (in sectors) of files, and can read or write individual sectors accordingly. More importantly, it should be able to say that file Y should follow directly after X, so that I can read them both without having to perform a seek.

I've checked out the idea behind MIT's exokernel, and it just assigns stretches of disk blocks to a user library, which then manages its own little filesystem at its own discretion. This means that two distinct applications need to have access to the same library to be able to use data from each other. Based on that, the filesystem should be a preset like it is on any other system, and all applications should be able to work with whatever filesystem is present.

This led to the following initial design:
  • The FS driver controls which applications get access to what sectors
  • Applications can ask the FS for a file, and it will return (part of) a blocklist, and tell the disk driver to give the process the relevant permissions
  • Applications can ask the FS for free blocks, and can ask the FS to create a file using a provided blocklist.
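
A minimal sketch of the three points above, not taken from any real OS: the class names (`DiskDriver`, `FsDriver`) and methods (`grant`, `open_file`, `allocate`, `create_file`) are hypothetical, and processes are identified by a plain integer pid. It only shows how the FS driver could hand out blocklists while the disk driver enforces per-sector permissions.

```python
from dataclasses import dataclass, field


@dataclass
class DiskDriver:
    """Hypothetical disk driver: tracks which process may touch which sector."""
    grants: dict = field(default_factory=dict)  # pid -> set of sectors

    def grant(self, pid, sectors):
        self.grants.setdefault(pid, set()).update(sectors)

    def may_access(self, pid, sector):
        return sector in self.grants.get(pid, set())


@dataclass
class FsDriver:
    """Hypothetical FS driver: owns the name -> blocklist table and the free list."""
    disk: DiskDriver
    free: set
    files: dict = field(default_factory=dict)  # name -> list of sectors

    def open_file(self, pid, name):
        # Return the blocklist and tell the disk driver to grant access to it.
        blocks = self.files[name]
        self.disk.grant(pid, blocks)
        return list(blocks)

    def allocate(self, pid, count):
        # Hand out free blocks; the caller decides how to lay a file out in them.
        if count > len(self.free):
            raise OSError("not enough free blocks")
        blocks = sorted(self.free)[:count]
        self.free.difference_update(blocks)
        self.disk.grant(pid, blocks)
        return blocks

    def create_file(self, pid, name, blocks):
        # Only accept a blocklist made of sectors the caller was granted.
        if not all(self.disk.may_access(pid, b) for b in blocks):
            raise PermissionError("blocklist contains sectors not granted to caller")
        self.files[name] = list(blocks)
```

Note that nothing in this sketch lets the FS driver move a file after a blocklist has been handed out.
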
This would work on any FS that supports fragmentation and does not perform journaling. That's exactly where the problems start:
  • In SFS, files need to be contiguous on disk, and the FS driver needs to defragment the disk the moment it does not have space for the extent needed, potentially invalidating all blocklists
  • Similarly, defragmenting a FAT system leads to the same problem.
  • Data journaling is not possible: An application would potentially overwrite parts of files and leave the rest intact. Redirecting writes elsewhere breaks an application's assumption that files are stored consecutively on disk, and can cause fragmentation.
  • The above problem probably manifests at its peak horror with versioning filesystems.
Do any of you have a solution to the above problems without requiring excess messaging overhead, lock contention, or tons of if(property)'s on the client side to make things work together efficiently? Ideas or partial solutions are welcome as well.

Thanks in advance

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 4:01 am
by Solar
Wouldn't it be up to the libOS to handle any necessary abstraction? That the exokernel allows the libOS access at the block level doesn't mean the libOS should pass this access on to the application, does it?

The freedom with an exokernel is that you can have a libOS handling blocks x-y this way, and another libOS handling blocks y-z that way, and allowing an application to bring its very own libOS into the arena. But as far as I understood the concept, to have any cooperation between applications, they have to share the same libOS, or at least compatible libOS's.

(I dropped the idea of an exokernel for me, personally. Too many headaches. 8) )

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 4:09 am
by Combuster
The reason I'm not doing it that way is because:
- An application with a specific libos won't be able to read, e.g., an SD card when FAT16 is not part of the library.
- An application could either spoof a libos and reinterpret the security bits, or all ACL-related features should be part of a host FS, which leads back to the problem that only one FS can exist per medium.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 5:51 am
by Solar
That's what you get for using an exokernel IMHO: Maximum customizability, maximum confusion.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 8:48 am
by Owen
IMO, the only way to realise a practical Exokernel system is to assume that, while there may be multiple operating systems running under it, there is one primary OS responsible for resource control; all others are slave to that. In fact, I'd go further and say that applications should run exclusively under the primary OS; secondary OS's are used either for (a) paravirtualization or (b) specialized device drivers.

In fact, I'd go further and say that I have only seen two practical applications of exokernels/similar systems:
  • Symbian/EKA2: EKA2 is a realtime "nanokernel"; Symbian is a non-realtime OS built on top of it. EKA2 is used to allow the GSM/UMTS/LTE signalling stack to share the processor with the Symbian platform; in this case, the signalling stack is effectively a glorified driver
  • The aforementioned virtualization and paravirtualization; particularly where one of the VMs is responsible for arbitrating hardware access for the others (e.g. Xen style Dom0/DomUs)
My personal opinion is that, for FS access, everything should go through the file system. This shouldn't impose much overhead either: App calls FS and gives it page(s) to scatter data into; FS calls disk driver and gives it page(s) and block list; disk driver reads data and then returns notification (and if you're good, you should be able to make the notification go directly back to the app). This seems to me to be especially necessary in the case of stacks; for example, EXT2 in loopback device on ReiserFS on LVM on Soft RAID-5 on 3 physical disks.
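
The call chain described above (app to FS, FS to disk driver, completion straight back to the app) could be sketched roughly as follows. This is an illustration only: `DiskDriver`, `FileSystem`, the `notify` callback and the in-memory `sectors` dictionary are all assumptions, and real drivers would do this asynchronously over IPC rather than with direct calls.

```python
class DiskDriver:
    """Hypothetical disk driver backed by an in-memory sector table."""

    def __init__(self, sectors):
        self.sectors = sectors  # lba -> bytes

    def read(self, blocklist, pages, notify):
        # Scatter each block into the page supplied by the caller.
        for lba, page in zip(blocklist, pages):
            data = self.sectors[lba]
            page[:len(data)] = data
        # Completion goes to whoever owns the callback, here the application.
        notify(len(blocklist))


class FileSystem:
    """Hypothetical FS server: translates a file name into a blocklist."""

    def __init__(self, disk, files):
        self.disk = disk
        self.files = files  # name -> list of lbas

    def read(self, name, pages, notify):
        blocklist = self.files[name][:len(pages)]
        # The app's callback is passed through untouched, so the FS is not
        # on the notification path.
        self.disk.read(blocklist, pages, notify)
```

The application never sees the blocklist in this variant, which is exactly the trade-off being argued about in this thread.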

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 10:27 am
by Combuster
Come on guys, this is not about religion or bad practices, this is about showing that things can be done. And apparently I should have said that earlier. :(

Anyway:

I want a design for a filesystem driver that works as a server in a microkernel environment, such that I can
- access files per block and manually schedule (within FS limits) when something is read or written.
- have filesystem independence. Should operate nicely with established FSes, including but not limited to FAT, SFS, XFS, ISO 9660, UDF, Reiser, JFS, ZFS, Btrfs and *FS.
- have permissions to the level allowed by the filesystem.
- minimize the amount of communication.

Any takers? (or people trying)

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 2:05 pm
by bewing
Well, honestly, it seems like you have a true conundrum here.

Leaving aside WORM-type devices, any rewritable medium in any reasonable filesystem in any reasonable theoretical OS is going to need some kind of defragging/reorganization daemon. Just as an example, you must recopy sectors of a magnetic disk every few years, or the magnetic domains will deteriorate.

So, assuming the computer is on for an extended period of time: an application may grab some sectors, write a file to them, and then the daemon is going to come along eventually and move the sectors. This is a generic situation, and should be handled generically. It sounds to me like it is absolutely necessary for the LBA->file mapping to have a method for invalidation. Your only other hope is to add one layer of indirection -- which defeats the entire spirit of an exokernel, really.

So the only real question seems to me to be how the mapping gets tested for invalidation. Perhaps the FS manager changes the permission bits on open mappings that have been invalidated? Then, if the app actually tries to access the mapping again after it's been invalidated, it gets an "invalidated, please remap" error? Perhaps that error could even be handled automatically, without even informing the app that it happened?
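
One way to picture the "invalidated, please remap" idea is the sketch below. Everything in it is hypothetical (`FsManager`, `FileHandle`, `MappingInvalidated`, the `check` call): the FS manager marks open mappings invalid when it relocates a file, and a library-side handle catches the error and remaps without the application noticing.

```python
class MappingInvalidated(Exception):
    """Hypothetical 'invalidated, please remap' error."""


class FsManager:
    def __init__(self, files):
        self.files = files  # name -> list of LBAs
        self.valid = {}     # (pid, name) -> is the handed-out mapping still valid?

    def map_file(self, pid, name):
        self.valid[(pid, name)] = True
        return list(self.files[name])

    def relocate(self, name, new_blocks):
        # Moving a file revokes every open mapping of that file.
        self.files[name] = list(new_blocks)
        for key in self.valid:
            if key[1] == name:
                self.valid[key] = False

    def check(self, pid, name):
        # Stands in for the permission check the disk driver would perform.
        if not self.valid.get((pid, name), False):
            raise MappingInvalidated(name)


class FileHandle:
    """Library-side wrapper that hides the remap from the application."""

    def __init__(self, fs, pid, name):
        self.fs, self.pid, self.name = fs, pid, name
        self.blocks = fs.map_file(pid, name)

    def lba_of(self, index):
        try:
            self.fs.check(self.pid, self.name)
        except MappingInvalidated:
            # Handle the error automatically: fetch the new mapping and carry on.
            self.blocks = self.fs.map_file(self.pid, self.name)
        return self.blocks[index]
```

The sketch does not address what happens if the file is relocated again between the remap and the actual disk access.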

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 2:18 pm
by Combuster
I was thinking along that line, and yes it can work. There's a problem though, how would you prevent starvation when some app is repeatedly writing a file which causes the mapping to be invalidated before some other app can try to read it?

There's a second ABA-problem (file A moves, file B moves into A's position, and trying to read the sectors that used to be A returns B), which can be solved with tagging but it isn't the cleanest either.
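
A rough sketch of the tagging idea, assuming a per-placement generation counter; the names `TaggedDisk`, `place`, `move` and `read` are made up for illustration. A reader has to present the tag it was given together with the sector number, so a stale mapping for A cannot silently return B's data.

```python
class TaggedDisk:
    """Hypothetical sector table where every placement carries a generation tag."""

    def __init__(self):
        self.generation = 0
        self.owner = {}  # lba -> (file name, generation tag)

    def place(self, name, lbas):
        # Every placement gets a fresh tag, even if the sectors were used before.
        self.generation += 1
        for lba in lbas:
            self.owner[lba] = (name, self.generation)
        return self.generation

    def move(self, name, old_lbas, new_lbas):
        for lba in old_lbas:
            self.owner.pop(lba, None)
        return self.place(name, new_lbas)

    def read(self, lba, tag):
        entry = self.owner.get(lba)
        if entry is None or entry[1] != tag:
            raise LookupError("stale mapping, please remap")
        return entry[0]  # name of the file the sector belongs to
```

The cost is that every request carries an extra tag and the driver needs a per-sector (or per-extent) table to compare against, which is presumably why it "isn't the cleanest".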

At least it is a step closer to the goal.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Mon Sep 13, 2010 11:47 pm
by quanganht
Owen's idea seems to be more suitable for this kernel design. Actually, it is how Xen VM works. Xen's Dom0 is a modified Linux kernel which acts as a master, and all other kernels are slaves. That way maybe you can control applications better.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Tue Sep 14, 2010 12:05 am
by Brendan
Hi,

There are basic rules. An OS can bend the rules a little, but the rules can't be avoided.

At any point in time; for all types of resources (I/O ports, IRQs, sectors, files, network connections, software interfaces, whatever), for all types of OS (exo-kernel, micro-kernel, monolithic, whatever):
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
For groups of resources of any type or any mixture of different types (for all types of OS):
  • the group of resources can be sub-divided, or
  • the group of resources can be used to create a different kind of resource
Initially (at least for 80x86) there are only 3 types of resources:
  • areas of the physical address space (pages)
  • areas of the I/O address space (I/O ports)
  • CPUs
All other types of resources are created (either directly or indirectly) from these initial resources.

So, an example:
  • "Physical Address Space Manager" is given exclusive access to all of physical memory, and sub-divides it.
  • "Memory Manager" is given all "usable RAM" pages by "Physical Address Space Manager", and sub-divides it.
  • "Device Manager" is given exclusive access to all I/O ports, and "Physical Address Space Manager" gives "Device Manager" exclusive access to all memory mapped I/O areas. "Device Manager" keeps some of these resources to itself (e.g. the I/O ports needed to access PCI configuration space) and subdivides the rest of the resources (giving different I/O ports and different memory mapped I/O areas to the corresponding device drivers).
  • "PIC Driver" is given exclusive access to a few I/O ports by "Device Manager", and uses them to create a new "IRQ" resource. "PIC Driver" gives exclusive access to all IRQs back to "Device Manager".
  • "I/O APIC Driver" is given exclusive access to a memory mapped I/O area by "Device Manager", and uses it to create a new "IRQ" resource. "I/O APIC Driver" gives exclusive access to all IRQs back to "Device Manager".
  • "SATA Controller Driver" is given exclusive access to a few I/O ports and an IRQ by "Device Manager"; and uses them to create a new "SATA channel" resource. It sub-divides the new SATA channel resource.
  • "SATA Disk Driver #1" is given exclusive use of a SATA channel by "SATA Controller Driver". It uses this resource to create a new type of "sectors" resource; and sub-divides this new resource.
  • "Database Task" is given exclusive use of 1 million sectors by "SATA Disk Driver #1". It uses these resources to create a new type of "SQL channel" resource. The (potentially unlimited) number of "SQL channel" resources are sub-divided.
  • "FAT File System" is given exclusive use of 1 million sectors by "SATA Disk Driver #1". It uses these resources to create a new type of "file" resource. The new "file" resources are sub-divided (some tasks are given exclusive read/write access to some files, some tasks are given shared read only access to other files, etc).
  • "SATA CD-ROM Driver #1" is given exclusive use of a SATA channel by "SATA Controller Driver". It uses this resource to create a new type of "sectors" resource; and sub-divides this new resource.
  • "ISO9660 File System" is given exclusive use of all of the sectors provided by "SATA CD-ROM Driver #1". It uses these resources to create a new type of "file" resource. The new "file" resources are sub-divided (different tasks are given shared read only access to files, etc).
  • "Unzip" is given exclusive use of a file by "FAT File System". It uses this resource to create a new type of "file" resource. The new "file" resources are sub-divided (some tasks are given exclusive read/write access to some files, some tasks are given shared read only access to other files, etc).
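
The rules and the example chain above can be modelled in a few lines. This is only a sketch under assumed names (`Resource`, `subdivide`, `derive`, `acquire_read`, `acquire_write`); it shows sub-division, creating a different kind of resource from an existing one, and the "one exclusive writer or many shared readers" rule.

```python
class Resource:
    """Hypothetical resource: a kind, a set of units, and an access state."""

    def __init__(self, kind, units, owner):
        self.kind = kind
        self.units = set(units)
        self.owner = owner
        self.writer = None
        self.readers = set()

    def subdivide(self, units, new_owner):
        # Hand part of this resource to someone else; we no longer hold it.
        units = set(units)
        if not units <= self.units:
            raise ValueError("cannot give away units we do not hold")
        self.units -= units
        return Resource(self.kind, units, new_owner)

    def derive(self, kind, units):
        # Use this resource to create a different kind of resource,
        # e.g. "file" resources built from "sectors".
        return Resource(kind, units, self.owner)

    def acquire_write(self, task):
        if self.writer is not None or self.readers:
            raise PermissionError("resource is already in use")
        self.writer = task

    def acquire_read(self, task):
        if self.writer is not None:
            raise PermissionError("resource is held exclusively")
        self.readers.add(task)
```

For example, a "sectors" resource owned by the disk driver can be sub-divided for the FAT driver, which in turn derives "file" resources and hands those out under the read/write rule.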

Cheers,

Brendan

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Tue Sep 14, 2010 2:30 am
by Combuster
Brendan wrote:
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
That is completely wrong. Shared memory disagrees, Windows' file management disagrees: both can have two separate entities share read/write access to the same block. And arguably the same holds for NICs where you occasionally want to be able to send any traffic from any app, and read all traffic coming in (think virtual machines)

I don't really see what the rest is supposed to help me with, being that such a hierarchical system already exists within my OS :?
quanganht wrote:Owen's idea seems to be more suitable for this kernel design. Actually, it is how Xen VM works. Xen's Dom0 is a modified Linux kernel which acts as a master, and all other kernels are slaves. That way maybe you can control applications better.
What part of "See if it can be done" needs elaboration? :(

My OS is not Xen. My OS is not a clone of a famous Exokernel. My OS does not virtualize many guest OSes. In fact, I want this to work independently of the kernel so I can plug the same code into, e.g., my development platform to build whatever disk images I want.

And besides, Owen's design is one for inter-process communication that uses shared memory, it is NOT a driver interface at all.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Tue Sep 14, 2010 4:32 am
by Brendan
Hi,
Combuster wrote:
Brendan wrote:
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
That is completely wrong. Shared memory disagrees, Windows' file management disagrees: both can have two separate entities share read/write access to the same block. And arguably the same holds for NICs where you occasionally want to be able to send any traffic from any app, and read all traffic coming in (think virtual machines)
A better example of "an OS can bend the rules a little" would've been "append mode" file access, where one or more tasks can have read access to a file while another task has append access to the same file.

For shared memory and Windows' file management, how do you guarantee that different pieces of code don't screw each other up? Do you have a reentrancy lock (where whoever holds the lock has exclusive access), or (for shared memory) do you have atomic pointer/s (maybe "head" and "tail" pointers) saying which tasks have exclusive access to which areas of the shared memory, or (for Windows' file management) do you give a writer a virtual copy of the file (e.g. where writes are buffered and don't affect reads made by other tasks) so that all tasks have read access to the original file and writers have exclusive access to the parts that they modified?

Mostly what I'm asking is "in which way are the rules bent a little".
Combuster wrote:I don't really see what the rest is supposed to help me with, being that such a hierarchical system already exists within my OS :?
Did you see any of your design problems in my example?
Combuster wrote:I'm stuck with a design problem: In traditional exokernel design, it is customary that an application has access to the locations (in sectors) of files, and can read or write individual sectors accordingly.
If an application has been granted access to a file (one type of resource), then it has not been given access to any raw sectors (which are a completely different type of resource). I have no idea why you made the mistake of thinking "it's customary", or how you could forget about things like NFS.


Cheers,

Brendan

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Tue Sep 21, 2010 12:05 pm
by quartsize
Hello,
Combuster wrote:I've checked out the idea behind MIT's exokernel, and it just assigns stretches of disk blocks to a user library, which then manages its own little filesystem at its own discretion.
I believe XN is a little more complicated than that. If I remember correctly, through the use of UDFs, XN knows enough about the filesystem metadata format to make access control decisions, and therefore can multiplex a filesystem among several libOSes.

It really just goes to show that you have to jump through a lot of hoops to achieve exokernel-level flexibility.

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Fri Sep 24, 2010 1:40 pm
by rdos
I don't think there is a need for a "Device Manager" which handles all kinds of resources. It works just as well to simply let each driver use its allocated I/Os (either hard-coded, through PCI or something else). IRQs will be handled by an IRQ handler, and it can export functions to either share an IRQ or request exclusive access to it. When it comes to protecting data structures, I use a simple "section" primitive defined in the task manager. A device that needs to protect its data structures will simply create a section and use enter/leave when it wants to protect them. IOW, I use a non-hierarchical, local approach to resources and resource protection.
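
As an illustration of the "section" primitive with enter/leave, here is a sketch that protects a small IRQ table. The `Section` class is modelled on an ordinary mutex and `IrqTable.request` is a made-up stand-in for the exported share/exclusive functions; neither is taken from rdos's actual code.

```python
import threading


class Section:
    """Hypothetical 'section' primitive: a plain mutex with enter/leave."""

    def __init__(self):
        self._lock = threading.Lock()

    def enter(self):
        self._lock.acquire()

    def leave(self):
        self._lock.release()


class IrqTable:
    """Local, non-hierarchical bookkeeping of who handles which IRQ."""

    def __init__(self):
        self.section = Section()
        self.entries = {}  # irq -> (exclusive flag, list of handlers)

    def request(self, irq, handler, exclusive=False):
        self.section.enter()
        try:
            entry = self.entries.get(irq)
            # Refuse if the IRQ is already taken and either side wants it alone.
            if entry is not None and (exclusive or entry[0]):
                return False
            if entry is None:
                entry = (exclusive, [])
                self.entries[irq] = entry
            entry[1].append(handler)
            return True
        finally:
            self.section.leave()
```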

Re: FS Drivers: How to integrate stock filesystems in exoker

Posted: Tue Oct 26, 2010 12:50 pm
by Combuster
Thanks to everyone who tried to help, including those testing my stubbornness :wink:
bewing wrote:So the only real question seems to me to be how the mapping gets tested for invalidation. Perhaps the FS manager changes the permission bits on open mappings that have been invalidated? Then, if the app actually tries to access the mapping again after it's been invalidated, it gets an "invalidated, please remap" error? Perhaps that error could even be handled automatically, without even informing the app that it happened?
I came up with the following design. It's not perfect, but at least it leaves lock ownership on the trusted side. Essentially, what I plan on doing is putting an MMU on top of the disk, then letting the FS driver manage the virtual spaces (of which there would be one per file). Then if the filesystem needs to change something, it can temporarily put a lock over the space's permissions, causing a stall on the less privileged side. At any time, the process can look up the exact mappings from this MMU, and it can subscribe to notifications that are sent when its projection of the disk changes. Negotiation of the exact sectors happens between the application and the FS server.

By making the MMU recursive, file (capability) sharing and loopback devices can be implemented without adding a level of indirection.
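
To make the "MMU on top of the disk" idea concrete, here is a small sketch with invented names (`FileSpace`, `remap`, `lookup`, `subscribe`). Each file gets its own virtual block space, the FS side holds a lock while it changes a mapping, lookups stall on that lock, and subscribers are told when a mapping changes. The recursion is modelled naively by chaining lookups through a backing space; a real implementation would presumably flatten the translation to avoid the extra hop.

```python
import threading


class FileSpace:
    """Hypothetical per-file virtual block space: virtual block -> backing block."""

    def __init__(self, backing=None):
        self.backing = backing  # another FileSpace, or None for the raw disk
        self.table = {}
        self.lock = threading.Lock()
        self.subscribers = []

    def subscribe(self, callback):
        # The callback receives the virtual block whose mapping changed.
        self.subscribers.append(callback)

    def remap(self, vblock, target):
        # FS side: hold the lock while the mapping changes, then notify.
        with self.lock:
            self.table[vblock] = target
        for callback in self.subscribers:
            callback(vblock)

    def lookup(self, vblock):
        # Application side: stalls here while the FS holds the lock.
        with self.lock:
            target = self.table[vblock]
        if self.backing is not None:
            return self.backing.lookup(target)
        return target
```

A loopback device would then be a `FileSpace` whose `backing` is the space of the image file, so looking up a block of the inner file resolves to a sector on the physical disk.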

Comments?