Korona wrote:Instead of using POSIX-like syscalls, the C library sends an IPC request to a POSIX server to handle things like open(). Once the file is open()ed, the process directly communicates with the FS driver (using the same IPC mechanism). There is no need to have a POSIX-compatible syscall API since instead of having the kernel forward requests, we can just as well directly send the request to the target server.
This is basically the exact same misunderstanding of POSIX I keep hearing from such luminaries as Linus Torvalds: POSIX does not know about system calls. It does not prescribe system calls. It prescribes other concepts, like file names, the file tree, etc., but not system calls. It does define an open() function, certainly, but that can be implemented in any number of ways, and even if the OS offers a SYS_open system call, that does not mean that that call is the only possible implementation of the open() function.
I should probably clarify (or correct myself): it is the OS in its entirety that needs to be POSIX compatible. What I meant is that the syscalls need to be such that it is possible to build a POSIX library on top. In your case, you implement open() not with a system call that opens a file, but with a system call that sends a request to the POSIX server, asking it to open the file. Cygwin is another attempt at a POSIX library, this time on top of NT. There, open() is more complicated because they are implementing pseudo file systems in a library. It does work; it's just slow.
Ethin wrote:Get rid of the overhead of syscalls and just directly send requests to the server via a bidirectional communication channel of some kind.
That does not work. If the channel is some kernel-side channel like a pipe or socket, you still need to call send and receive system calls. If it is shared memory, you need to notify the receiving process, using, you guessed it, a system call. And if it is shared memory, but the receiving process polls for updates, Greta Thunberg will burn your house down. In that case you would be exchanging the system call for a receiver that burns CPU time spinning on memory that has not changed, which is a massive amount of needless work.
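To make the shared-memory case concrete, here is a minimal sketch in C. It assumes a POSIX system; the struct and the channel_* names are made up for illustration, and a plain buffer stands in for the shared mapping. The data itself moves without the kernel, but waking the receiver still costs a write() on one side and a poll()/read() on the other.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical channel: a buffer standing in for shared memory, plus a
 * pipe that is used purely as a doorbell. */
struct channel {
    char data[64];
    int  doorbell[2];   /* [0] = receiver's end, [1] = sender's end */
};

static int channel_open(struct channel *ch)
{
    assert(ch != NULL);
    memset(ch->data, 0, sizeof ch->data);
    return pipe(ch->doorbell);
}

/* Sender: store the message without involving the kernel, then ring the
 * doorbell. The write() is the system call you cannot get rid of. */
static int channel_post(struct channel *ch, const char *msg)
{
    char token = 1;

    assert(ch != NULL && msg != NULL);
    strncpy(ch->data, msg, sizeof ch->data - 1);
    ch->data[sizeof ch->data - 1] = '\0';
    return write(ch->doorbell[1], &token, 1) == 1 ? 0 : -1;
}

/* Receiver: sleep in the kernel until the doorbell rings instead of
 * spinning on the buffer. Returns 1 if a message is ready, 0 on timeout,
 * -1 on error. */
static int channel_wait(struct channel *ch, int timeout_ms)
{
    struct pollfd pfd;
    char token;
    int r;

    assert(ch != NULL);
    pfd.fd = ch->doorbell[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    r = poll(&pfd, 1, timeout_ms);
    if (r <= 0)
        return r;
    return read(ch->doorbell[0], &token, 1) == 1 ? 1 : -1;
}
```

The receiver calls channel_wait() and only looks at the buffer once it returns 1. Swap the pipe for a futex or an eventfd if you like; the notification is a system call either way.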
Ethin wrote:Use a serialization library like serde or rkyv which allow for custom serialization/deserialization implementations but, more importantly, allow data formats to have strong typing guarantees. This means that I can define a structure format like so:
If my C program interfaces with your Rust OS, I can just have it fill the SHM with whatever I want and send the receiving process into confusion. Complexity and security are enemies! I would suggest using a simple system with simple binary formats, so that the receiving process does not have to parse text, and validation is very simple. Decide ahead of time which calls a given server handles and what the arguments look like. Choose a communication channel that preserves datagram boundaries.
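As a rough sketch of what I mean, in C: the request layout, the opcodes and the limits below are all invented for the example, and an AF_UNIX datagram socket pair stands in for whatever boundary-preserving channel your OS ends up with (SOCK_SEQPACKET would do just as well where it is available). The point is that validation is a handful of comparisons, not a parser.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wire format: every request is exactly one fixed-size datagram. */
enum { OP_READ = 1, OP_WRITE = 2, OP_MAX = OP_WRITE };
enum { MAX_IO = 4096 };

struct request {
    uint32_t opcode;    /* one of OP_* */
    uint32_t handle;    /* server-issued file handle */
    uint64_t offset;
    uint32_t length;    /* bytes to transfer, at most MAX_IO */
    uint32_t reserved;  /* must be zero */
};

/* Returns 1 if the datagram is a well-formed request, 0 otherwise. */
static int request_valid(const struct request *rq, size_t received)
{
    assert(rq != NULL);
    if (received != sizeof *rq)
        return 0;
    if (rq->opcode < OP_READ || rq->opcode > OP_MAX)
        return 0;
    if (rq->length > MAX_IO)
        return 0;
    return rq->reserved == 0;
}

/* Sends `len` bytes as one datagram and validates them on the other end.
 * Returns 1 if the server side would accept the request, 0 if it would
 * reject it, -1 on a socket error. */
static int round_trip(const void *bytes, size_t len)
{
    /* One spare byte so an oversized datagram shows up as a wrong length. */
    unsigned char buf[sizeof(struct request) + 1];
    struct request rq;
    int sv[2];
    int verdict = -1;
    ssize_t n;

    assert(bytes != NULL);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    if (send(sv[0], bytes, len, 0) == (ssize_t)len) {
        n = recv(sv[1], buf, sizeof buf, 0);
        if (n >= 0) {
            size_t got = (size_t)n;
            memset(&rq, 0, sizeof rq);
            memcpy(&rq, buf, got < sizeof rq ? got : sizeof rq);
            verdict = request_valid(&rq, got);
        }
    }
    close(sv[0]);
    close(sv[1]);
    return verdict;
}
```

Because the channel keeps datagram boundaries, a short or oversized message is caught by the length check alone, and a hostile client gets a rejection instead of a confused server.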
One such system is implemented in FUSE. Each request starts with a common header, followed by a request-specific body. The header contains an opcode and a unique 64-bit number. The response also consists of a header and a request-specific body, and the response header repeats the unique number. This allows the OS to have several requests outstanding at the same FUSE server. Strings are avoided wherever possible: requests that name a directory entry (Lookup being the obvious one) are about the only ones that contain a string, and ReadDir is about the only reply that contains strings. And all strings are counted, Pascal style: their length is given as a number somewhere else, so you don't have to rely on NUL termination. Of course, libfuse still adds NUL termination everywhere.
That system is not very flexible. Which is a good thing. It allows diverse servers to work on diverse OSes.
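Here is a small sketch of the header-plus-unique-number idea in C. These structs are deliberately simplified and are not the real FUSE headers (those carry more fields); the client_* functions and the limit of eight outstanding requests are assumptions for the example. It shows how the unique number lets replies come back in any order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified headers (not the actual FUSE layout). */
struct req_header {
    uint32_t len;      /* header plus body, in bytes */
    uint32_t opcode;   /* which call this is; 0 is reserved as "invalid" */
    uint64_t unique;   /* echoed back in the response */
};

struct rsp_header {
    uint32_t len;
    int32_t  error;    /* 0, or a negative errno-style value */
    uint64_t unique;
};

enum { MAX_PENDING = 8 };

struct client {
    struct { uint64_t unique; uint32_t opcode; } pending[MAX_PENDING];
    uint64_t next_unique;
};

static void client_init(struct client *c)
{
    memset(c, 0, sizeof *c);
    c->next_unique = 1;
}

/* Fills in the header for a new request and remembers it as outstanding.
 * Returns 0 on success, -1 if too many requests are already in flight. */
static int client_begin(struct client *c, uint32_t opcode, uint32_t body_len,
                        struct req_header *out)
{
    assert(opcode != 0);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (c->pending[i].opcode == 0) {
            c->pending[i].unique = c->next_unique;
            c->pending[i].opcode = opcode;
            out->len = (uint32_t)sizeof *out + body_len;
            out->opcode = opcode;
            out->unique = c->next_unique++;
            return 0;
        }
    }
    return -1;
}

/* Matches a response to its request by unique number. Returns the opcode
 * of the request it answers, or 0 if nothing with that number is pending. */
static uint32_t client_complete(struct client *c, const struct rsp_header *rsp)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (c->pending[i].opcode != 0 && c->pending[i].unique == rsp->unique) {
            uint32_t opcode = c->pending[i].opcode;
            c->pending[i].opcode = 0;   /* slot is free again */
            return opcode;
        }
    }
    return 0;
}
```

A reply whose number matches nothing outstanding returns 0 and can simply be dropped, which is about as simple as validation gets.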
Ethin wrote:A normal program wouldn't have access to the buffer, only libc.
That is not a boundary your OS can enforce. Both are userspace. What one can access, the other can access.
Ethin wrote:I don't think there really is a really good way of securing the buffer from tampering.
Yes, there is: make the buffer inaccessible to the sender after sending. The simplest idea would be to allocate a page for the data in the sending process, fill it with data, then transfer the page to the receiving process. Then the receiving process can do what it wants and send the page back to the sender with the result. No actual data has to be copied; only the page mapping is moved.
If you allow the sender to still tamper with the data after sending it, you will never be able to verify anything. Anything you check could be changed immediately after checking. The receiver would have to copy the request from SHM into its private memory to look at it, which defeats the purpose of SHM.
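For completeness, if you are stuck with memory the sender can still write to, the copy-first pattern looks roughly like this in C. The message layout and msg_fetch() are hypothetical, and an ordinary struct stands in for the shared mapping. Every check runs on the private copy, so nothing the sender does afterwards can invalidate it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MSG_MAX_PAYLOAD = 64 };

/* Hypothetical message layout, living in memory the sender can still write. */
struct msg {
    uint32_t length;                     /* number of valid payload bytes */
    uint8_t  payload[MSG_MAX_PAYLOAD];
};

/* Snapshot first, validate second. Returns 0 and leaves a checked copy in
 * *priv, or -1 if the snapshot is malformed. */
static int msg_fetch(const struct msg *shared, struct msg *priv)
{
    assert(shared != NULL && priv != NULL);
    memcpy(priv, shared, sizeof *priv);  /* the only read of shared memory */
    if (priv->length > MSG_MAX_PAYLOAD)
        return -1;                       /* checked on the copy, not the original */
    return 0;
}
```

After msg_fetch() returns 0, the receiver works only on the private copy. That memcpy() is exactly the cost I mean when I say the copy defeats the purpose of SHM.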
Ethin wrote:In general the idea was to eliminate syscalls from the equation as much as possible. There's no point in using a syscall if you're just going to send something to another userspace process from the syscall.
Well, you won't be able to do without one unless you accept the heavy resource costs outlined above. So you might as well bite the bullet.
You might build something like io_uring, but even that uses system calls to notify the kernel of changes and it also has to notify the receiver somehow.