Designs of microkernels

bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Designs of microkernels

Post by bzt »

AndrewAPrice wrote:I was thinking about garbage collecting code - what about static and global variables? (Do they reset upon being unreachable?) Maybe we forbid them (in our custom language), and have a root 'System' object which devices, the vfs, window manager, etc, live under that we pass into a program's main?
I think having a root 'System' object is a good idea. Besides holding the globals, its methods could be like libc's functions: standardized and part of the language, something like a run-time library.
thewrongchristian wrote:Or port a JVM. Perhaps we could call it JavaOS :)

Joking aside, JavaOS on modern HW, with modern RAM levels and JIT would probably work pretty well.
This isn't as far-fetched as it first seems :-) But I would choose WASM. The custom language could easily compile to WASM bytecode, and there are tons of easy-to-port WASM VMs already.

Cheers,
bzt
thewrongchristian
Member
Posts: 424
Joined: Tue Apr 03, 2018 2:44 am

Re: Designs of microkernels

Post by thewrongchristian »

bzt wrote:
thewrongchristian wrote:Or port a JVM. Perhaps we could call it JavaOS :)

Joking aside, JavaOS on modern HW, with modern RAM levels and JIT would probably work pretty well.
This isn't as far-fetched as it first seems :-) But I would choose WASM. The custom language could easily compile to WASM bytecode, and there are tons of easy-to-port WASM VMs already.

Cheers,
bzt
I've just 'borrowed' Inside the JavaOS Operating System, and actually just bought a copy off eBay. Reading some of the implementation details was like skimming this thread.

The microkernel runs everything in supervisor mode, with a single address space and has no explicit IPC syscalls, instead passing stuff back and forth using pointers.

Good call on the WASM option. JNI is enough to put me off using a JVM, but I must admit I am a bit of a Java fanboy, and my C runtime in my kernel is quite heavily influenced by Java.

Do you have any recommendations on WASM VM to use?
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

AndrewAPrice wrote:I was thinking about garbage collecting code - what about static and global variables? (Do they reset upon being unreachable?) Maybe we forbid them (in our custom language), and have a root 'System' object which devices, the vfs, window manager, etc, live under that we pass into a program's main?
Reachability is usually defined in terms of a root. I can't think of a reasonable scheme that doesn't involve a root.

From there, you can have a global (accessible directly from the root) 'proc' object, which is an array of processes. Not dissimilar to /proc. Within every process object are stored that process's static variables. It's easy to take references to those variables, just the same as you would take a reference to any other shared variable, and cleanup is handled in the exact same way.

Probably you want tracing gc for more flexibility in your object representations, but that's ultimately somewhat incidental.
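
A minimal sketch of the reachability idea in C++, assuming a toy tracing collector. The names Object, Heap, Mark, Collect and demo_collect are hypothetical and only illustrate the root -> proc -> process -> statics chain described above; a real collector would also scan stacks and registers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// A heap object in a toy tracing collector: just a list of outgoing references.
struct Object {
    std::vector<Object*> refs;
};

// Toy heap that owns every object and frees whatever the root cannot reach.
class Heap {
public:
    Object* Allocate() {
        objects_.push_back(std::make_unique<Object>());
        return objects_.back().get();
    }

    // Mark phase: collect everything reachable from `root`.
    std::unordered_set<const Object*> Mark(const Object* root) const {
        std::unordered_set<const Object*> marked;
        std::vector<const Object*> stack{root};
        while (!stack.empty()) {
            const Object* obj = stack.back();
            stack.pop_back();
            if (!marked.insert(obj).second) continue;  // already visited
            for (const Object* ref : obj->refs) stack.push_back(ref);
        }
        return marked;
    }

    // Sweep phase: free every unmarked object and return how many were freed.
    std::size_t Collect(const Object* root) {
        const auto marked = Mark(root);
        const std::size_t before = objects_.size();
        objects_.erase(
            std::remove_if(objects_.begin(), objects_.end(),
                           [&](const std::unique_ptr<Object>& o) {
                               return marked.count(o.get()) == 0;
                           }),
            objects_.end());
        return before - objects_.size();
    }

private:
    std::vector<std::unique_ptr<Object>> objects_;
};

// Builds root -> proc -> process -> statics plus one unreachable object,
// then returns how many objects a collection frees.
std::size_t demo_collect() {
    Heap heap;
    Object* root = heap.Allocate();
    Object* proc = heap.Allocate();     // the global 'proc' array
    Object* process = heap.Allocate();  // one process entry
    Object* statics = heap.Allocate();  // that process's static variables
    heap.Allocate();                    // garbage: nothing refers to it
    root->refs.push_back(proc);
    proc->refs.push_back(process);
    process->refs.push_back(statics);
    return heap.Collect(root);
}
```

In this sketch the statics survive only because the process entry is still reachable from the root; dropping the process from 'proc' would make them garbage at the next collection.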
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

Regarding language/runtime choice:

I think the biggest problem with the JVM is not the awfulness of JNI but the fact that it's a truly massive hunk of code that has to live in your kernel. You don't want that, for many reasons; maintainability and security come to mind. (It somewhat defeats the purpose of having a microkernel...)

It's difficult to beat the jvm's performance, and popular wasm implementations won't take you anywhere close for quite a while. Firefox/chrome wasm implementations will be quite performant, but just as hard to work with as hotspot. Wasm is also missing several important pieces at the moment, like simd and gc; but these will (hopefully) come in the future.

At the intersection of small, performant, and safe, there are not many options. LuaJIT is astonishingly performant, but with caveats. In particular, LuaJIT depends on minimizing the scope of tables and identifiers in order to avoid having to perform runtime lookups. When tables are global and can be accessed from anywhere, these optimizations go to pot; you need real static data structures. So LuaJIT probably wouldn't work very well.

There are also the apls and the forths. I'm not going to say that either of these is a nonstarter, because I'm a fan of apl (and just don't know forth well enough to make a judgement about it) and intend to build an apl interpreter into my os; but both have ... narrow appeal.
AndrewAPrice
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Designs of microkernels

Post by AndrewAPrice »

moonchild wrote:
AndrewAPrice wrote:I was thinking about garbage collecting code - what about static and global variables? (Do they reset upon being unreachable?) Maybe we forbid them (in our custom language), and have a root 'System' object which devices, the vfs, window manager, etc, live under that we pass into a program's main?
Reachability is usually defined in terms of a root. I can't think of a reasonable scheme that doesn't involve a root.
I wasn't thinking about 'root' in terms of garbage collection, but how in a system with no static/global variables you couldn't call:

Code: Select all

string contents = FileSystem::ReadFile("test.txt");
How would you implement this static function if you had no way of getting an instance to the VFS?

So instead, in our theoretical language with no static/global variables, the launcher would pass in a 'System' object to the main() function, something like this:

Code: Select all

void main(System* system) {
  string contents = system->FileSystem()->ReadFile("test.txt");
}
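
A runnable C++ sketch of the same idea, under a few assumptions: the accessor is renamed GetFileSystem() because a C++ method cannot sensibly share its name with the FileSystem class, references replace pointers so there is nothing to null-check, and program_main() and launch() are hypothetical stand-ins for the program's main and the launcher. The in-memory FileSystem is only a placeholder for a real VFS.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the VFS; a real one would talk to drivers.
class FileSystem {
public:
    void AddFile(const std::string& path, std::string contents) {
        files_[path] = std::move(contents);
    }
    // Returns the file contents, or an empty string if the file is missing.
    std::string ReadFile(const std::string& path) const {
        auto it = files_.find(path);
        return it == files_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> files_;
};

// The root object handed to every program; it owns the services.
class System {
public:
    FileSystem& GetFileSystem() { return fs_; }

private:
    FileSystem fs_;
};

// Stand-in for the program's main(): it holds no globals and reaches
// everything through the System reference it was given.
std::string program_main(System& system) {
    return system.GetFileSystem().ReadFile("test.txt");
}

// Stand-in for the launcher: builds the root and passes it in.
std::string launch() {
    System system;
    system.GetFileSystem().AddFile("test.txt", "hello");
    return program_main(system);
}
```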
My OS is Perception.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Designs of microkernels

Post by bzt »

thewrongchristian wrote:Do you have any recommendations on WASM VM to use?
The wiki has some links. There's the "official" libwasmint, but somehow I do not feel comfortable about it (don't ask why, just a hunch). For an interpreter, wasm3 looks good. I'm not fond of JIT because they are a huge security risk, but for a JIT I'd recommend porting wasmjit.
AndrewAPrice wrote:How would you implement this static function if you had no way of getting an instance to the VFS?
Totally different kind of project, but I had exactly the same problem here. There for example PHPPE\Core::validate() had to access the instance without the caller knowing, so I've used a singleton instance self reference in a static property PHPPE\Core::$core as a workaround.

So FileSystem::ReadFile() could use a private FileSystem::vfs property under the hood to get the instance. Using methods like system->FileSystem()->ReadFile()
- is, let's face it, ugly as hell,
- makes the code harder to read,
- could trigger a NULL dereference (the caller doesn't check whether system->FileSystem() actually returns anything, yet it dereferences that unsafe pointer),
- and results in inefficient compiled code (you can forget about optimizations when non-static method calls are resolved at run time to allow polymorphism).
On the other hand, a private property instance
- is, I think, simpler and more readable (I prefer FileSystem::ReadFile()),
- has no NULL dereference hazard on the caller's side (ReadFile can easily check ::vfs before using it inside the method),
- and causes no problems for compiled code optimization.
Just my two cents.

Cheers,
bzt
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

bzt wrote:I'm not fond of JIT because they are a huge security risk
There are formally verified JITs. I linked one already; here's another more mature one.
AndrewAPrice
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Designs of microkernels

Post by AndrewAPrice »

bzt wrote:So FileSystem::ReadFile() under the hood could use FileSystem::vfs private property to get the instance.
If the theoretical language doesn't have static or global variables (because garbage collecting code and reloading it later would reset said variables), how would you implement this private property without an instance of FileSystem passed in?
My OS is Perception.
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

AndrewAPrice wrote:If the theoretical language doesn't have static or global variables (because garbage collecting code and reloading it later would reset said variables), how would you implement this private property without an instance of FileSystem passed in?
Make global variables globally reachable, so the gc process doesn't wipe them out.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Designs of microkernels

Post by bzt »

moonchild wrote:There are formally verified JITs. I linked one already; here's another more mature one.
Verifying the code is important, but that's not the issue. JIT compilation requires that userspace code be able to write memory which will later be executed as code. This opens up the possibility of injecting code with buffer overflow attacks (unrelated to the JIT), essentially putting you back in the era before the NX bit became widespread. On ARM, for example, you can globally configure the MMU to only allow executing memory that is not writable (called W^X, referring to "exclusive or"). In order to implement JIT, you must turn this security feature off globally.

With interpreters, you can make sure that all code is read-only and all data is non-executable for ring 3 tasks, making it impossible to inject code with a buffer overflow (unless the buffer overflow is in ring 0, which can still write code into memory). A possible solution could be a JIT compiler running in ring 0 and then executing the generated code in ring 3, but I haven't seen such a solution (yet).
AndrewAPrice wrote:how would you implement this private property without an instance of FileSystem passed in?
You misunderstood, the whole point is, there should be no need to pass the FileSystem instance during calls. One time in the code, during initialization, you create an instance and save that in a static private property. Then later, in the static method calls, they can get that instance without the caller knowing anything about it.

Code: Select all

// okay, stupid example, but I hope you get what I mean
FileSystem::init()
{
    FileSystem::vfs = new VFS();
    // ... other things to do during init
}

// just an example method for accessing it
FileSystem::ReadFile(fn)
{
    if (FileSystem::vfs == NULL) return ERROR;
    FILE *f = FileSystem::vfs->Open(fn);
    buffer buf = FileSystem::vfs->Read(f);
    FileSystem::vfs->Close(f);
    return buf;
}
This is not valid code, just a demonstration of private property usage. NOTE: the instance is stored in a property, so it's not global, yet the GC should not release it while the FileSystem class is accessible. When you unload the FileSystem code from memory, it will be freed along with the other FileSystem properties.

In my example OOP code that I linked, the property is set in the main class' constructor like "self::$core = &$this;" (I instantiate PHPPE\Core class only once). Then the static methods can access this instance anytime if they need it (for example in PHPPE\Core::lib()).

I'd like to point out that this trick only works for 'root' objects, which are singletons (such as the VFS). For classes with multiple objects (like mount points of the same file system type), you should use separate instances as usual (and maybe store them in an array in a VFS property). The language also needs to support instantiating static classes and making static method calls on that same class. Make no mistake, this is a dirty hack so that callers don't have to pass instance references around :-)
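
For comparison, a compilable C++ sketch of the private static property trick. It assumes a language that does allow static properties (which is exactly the point under debate in this thread), and the in-memory VFS, Init(), Shutdown() and demo_singleton() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical VFS; a real one would dispatch to mounted file systems.
class VFS {
public:
    void AddFile(const std::string& name, const std::string& data) {
        files_[name] = data;
    }
    // Copies the file into `out` and returns true, or returns false if missing.
    bool Read(const std::string& name, std::string& out) const {
        auto it = files_.find(name);
        if (it == files_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> files_;
};

class FileSystem {
public:
    // Called once during initialization; creates the single VFS instance.
    static void Init() {
        vfs_ = std::make_unique<VFS>();
        vfs_->AddFile("test.txt", "hello");  // demo content only
    }
    // Callers never see the instance; the check for a missing VFS lives here.
    static bool ReadFile(const std::string& name, std::string& out) {
        if (!vfs_) return false;
        return vfs_->Read(name, out);
    }
    // Models unloading the FileSystem code: the instance goes away with it.
    static void Shutdown() { vfs_.reset(); }

private:
    static std::unique_ptr<VFS> vfs_;  // the private static property
};

std::unique_ptr<VFS> FileSystem::vfs_;

// Walks through the life cycle: before init, after init, after unload.
bool demo_singleton() {
    std::string out;
    const bool before = FileSystem::ReadFile("test.txt", out);
    FileSystem::Init();
    const bool during = FileSystem::ReadFile("test.txt", out) && out == "hello";
    FileSystem::Shutdown();
    const bool after = FileSystem::ReadFile("test.txt", out);
    return !before && during && !after;
}
```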

Cheers,
bzt
Last edited by bzt on Thu Jan 14, 2021 4:32 pm, edited 1 time in total.
Octocontrabass
Member
Posts: 5512
Joined: Mon Mar 25, 2013 7:01 pm

Re: Designs of microkernels

Post by Octocontrabass »

bzt wrote:On ARM, for example, you can globally configure the MMU to only allow executing memory that is not writable (called W^X, referring to "exclusive or"). In order to implement JIT, you must turn this security feature off globally.
I don't see any reason why you would need to globally disable it. You can have the compiler mark pages read-only before executing them, or share the pages between two address spaces where the compiler has write access but the thread executing the program has read-only access.
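
A minimal sketch of the first option on a POSIX system, using mmap and mprotect so the pages are never writable and executable at the same time. The helper names publish_code and demo_publish are hypothetical, the bytes are placeholders that are never executed, and a real JIT on ARM would additionally have to synchronize the instruction cache before jumping into the pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Copies `code` into fresh pages, then flips them from read/write to
// read/execute. Returns the mapping, or nullptr on failure.
void* publish_code(const std::uint8_t* code, std::size_t len,
                   std::size_t* mapped_len) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t size = ((len + page - 1) / page) * page;

    // Phase 1: writable but not executable while the compiler emits code.
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    std::memcpy(mem, code, len);

    // Phase 2: executable but no longer writable before anything runs.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return nullptr;
    }
    *mapped_len = size;
    return mem;
}

// Publishes placeholder bytes and checks that they are still readable.
// Nothing is executed, so the sketch stays architecture independent.
bool demo_publish() {
    const std::uint8_t bytes[] = {0x01, 0x02, 0x03, 0x04};
    std::size_t mapped = 0;
    void* mem = publish_code(bytes, sizeof bytes, &mapped);
    if (mem == nullptr) return false;
    const bool same = std::memcmp(mem, bytes, sizeof bytes) == 0;
    munmap(mem, mapped);
    return same;
}
```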
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Designs of microkernels

Post by bzt »

Octocontrabass wrote:I don't see any reason why you would need to globally disable it. You can have the compiler mark pages read-only before executing them, or share the pages between two address spaces where the compiler has write access but the thread executing the program has read-only access.
In theory true, but in practice it is still difficult to do correctly. Not to mention that in practice it is often impossible to use two address spaces (most applications want to run JIT code as threads, which leaves us with the first option). I guess this is the main reason why in Linux mmap calls the MAP_DENYWRITE and MAP_EXECUTABLE flags are simply ignored. But with a custom kernel that can remap pages read-only, this is doable. It would provide a separation similar to the solution I proposed (compiler in ring 0, generated code executed in ring 3), so it would work (POSIX says that setting both PROT_EXEC and PROT_WRITE with mprotect is implementation specific).

Cheers,
bzt
Octocontrabass
Member
Posts: 5512
Joined: Mon Mar 25, 2013 7:01 pm

Re: Designs of microkernels

Post by Octocontrabass »

bzt wrote:Not to mention that in practice it is often impossible to use two address spaces (most applications want to run JIT code as threads, which leaves us with the first option).
Why would running the JIT code as threads stop you from putting those threads in a separate address space from the compiler?
bzt wrote:I guess this is the main reason why in Linux mmap calls the MAP_DENYWRITE and MAP_EXECUTABLE flags are simply ignored.
Aren't those already covered by PROT_WRITE and PROT_EXEC?
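
A rough sketch of the shared-pages option, with the protections passed directly as mmap's prot argument. To stay short it maps the same temporary file twice inside one process; in the design discussed here the two views would live in different address spaces, and the consumer's view would presumably also carry PROT_EXEC. The function name demo_two_views and the /tmp path template are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Maps one temporary file twice: a writable view for the compiler and a
// read-only view for the consumer. Returns true if a write through the
// first view is visible through the second.
bool demo_two_views() {
    char path[] = "/tmp/jit-sketch-XXXXXX";
    const int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the open descriptor and mappings keep the file alive

    const std::size_t size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return false;
    }

    // Same pages, different permissions per view.
    void* writer = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* reader = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);

    bool ok = false;
    if (writer != MAP_FAILED && reader != MAP_FAILED) {
        const char msg[] = "generated code";
        std::memcpy(writer, msg, sizeof msg);            // compiler side writes
        ok = std::memcmp(reader, msg, sizeof msg) == 0;  // consumer side reads
    }
    if (writer != MAP_FAILED) munmap(writer, size);
    if (reader != MAP_FAILED) munmap(reader, size);
    return ok;
}
```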
moonchild
Member
Posts: 73
Joined: Wed Apr 01, 2020 4:59 pm
Libera.chat IRC: moon-child

Re: Designs of microkernels

Post by moonchild »

bzt wrote:Verifying the code is important, but that's not the issue. JIT compilation requires that userspace code be able to write memory which will later be executed as code. This opens up the possibility of injecting code with buffer overflow attacks (unrelated to the JIT), essentially putting you back in the era before the NX bit became widespread. On ARM, for example, you can globally configure the MMU to only allow executing memory that is not writable (called W^X, referring to "exclusive or"). In order to implement JIT, you must turn this security feature off globally.

With interpreters, you can make sure that all code is read-only and all data is non-executable for ring 3 tasks, making it impossible to inject code with a buffer overflow (unless the buffer overflow is in ring 0, which can still write code into memory). A possible solution could be a JIT compiler running in ring 0 and then executing the generated code in ring 3, but I haven't seen such a solution (yet).
If you implement a safe language with the JIT, you can't have buffer overflows :)

(This was the original impetus behind using a JIT. If you don't JIT a safe language then there's no reason to use JIT at all and we can just use regular binaries. Running in ring 3 would also defeat the purpose of having a JIT, as the point was to eliminate context switches.)
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Designs of microkernels

Post by bzt »

Octocontrabass wrote:Why would running the JIT code as threads stop you from putting those threads in a separate address space from the compiler?
Because it is highly inefficient? What would that even look like in practice? Web browsers forking two processes per tab (to have two separate address spaces for each tab, one for running JS and one for the compiler)?
Octocontrabass wrote:Aren't those already covered by PROT_WRITE and PROT_EXEC?
No. First, mmap does not understand those flags, only the MAP_* ones (which are ignored), so setting PROT_* would take a separate mprotect call; and second, POSIX says the MMU protection actually set for those flags is implementation specific (they might not do anything).
moonchild wrote:If you implement a safe language with the JIT, you can't have buffer overflows :)
As I've said, I mean possible buffer overflows unrelated to the JIT. BTW, I haven't seen any WASM VM that verifies the bytecode and guarantees to generate 100% bullet-proof native code, and even wasmjit runs the compiled code in ring 0 (yeah, what could go wrong, right?). Do you know of any WASM JIT compiler that actually verifies the code and places bounds checks in the generated code? Please let me know if there is one. (I'm not being sarcastic, I'm genuinely curious.)
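
For reference, a sketch of what a bounds check on WASM-style linear memory amounts to. The WebAssembly specification requires out-of-bounds memory accesses to trap; the LinearMemory class below is a hypothetical illustration of that rule in plain C++ (with an exception standing in for the trap), not code taken from any particular VM or JIT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Toy linear memory: every access is checked, and an out-of-bounds access
// raises an error instead of touching memory outside the sandbox.
class LinearMemory {
public:
    explicit LinearMemory(std::size_t bytes) : data_(bytes, 0) {}

    std::uint32_t LoadU32(std::uint64_t addr) const {
        Check(addr, sizeof(std::uint32_t));
        std::uint32_t value;
        std::memcpy(&value, data_.data() + addr, sizeof value);
        return value;
    }

    void StoreU32(std::uint64_t addr, std::uint32_t value) {
        Check(addr, sizeof value);
        std::memcpy(data_.data() + addr, &value, sizeof value);
    }

private:
    // addr is 64-bit so a 32-bit base plus a 32-bit offset cannot wrap around.
    void Check(std::uint64_t addr, std::size_t width) const {
        if (addr > data_.size() || data_.size() - addr < width)
            throw std::out_of_range("trap: out-of-bounds memory access");
    }

    std::vector<std::uint8_t> data_;
};

// Returns true if a store that straddles the end of memory traps.
bool traps_out_of_bounds() {
    LinearMemory mem(16);
    try {
        mem.StoreU32(14, 1);  // bytes 14..17 do not fit in 16 bytes
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Stores a value in the last valid slot and reads it back.
std::uint32_t round_trip() {
    LinearMemory mem(16);
    mem.StoreU32(12, 0xdeadbeefu);
    return mem.LoadU32(12);
}
```

A JIT would have to emit the equivalent of Check() before each load and store, or rely on guard pages to get the same trap from the hardware.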

Cheers,
bzt