Exec call in userspace library

Post by **NickJohnson** » Thu Mar 26, 2009 3:42 pm

I have an interesting idea, and I was wondering if it is at all feasible. I'm in the middle of designing my kernel, and the way I have it, it is hard to do exec() from kernel space properly, for various address space reasons. Then I had an idea: why not make the main C library do execution? It would clear the current address space, load a new image using normal system calls, then set it up. It could implement demand paging or any other cool feature with no changes to the kernel itself, not to mention support for other binary formats or interpreters. The library would be loaded at boot by the kernel (using a very simple flat binary format, probably), and could probably do some sort of bootstrapping of a new library image if put together correctly. It could also implement any kind of dynamic loading setup independent of other tasks.

This does require some other things, most importantly the handling of page faults and other signals/interrupts in userspace. That was already part of my design, so it fit in well.

So would this work well, or have I missed some serious issue? And has any other system implemented something like this?

Post by **nekros** » Thu Mar 26, 2009 8:06 pm

Letting userspace screw with paging would add instability to the system. I really don't see why you think it's too much trouble to write an exec system call in the kernel.

Post by **skyking** » Fri Mar 27, 2009 10:11 am

How would you handle suid executables? If all is done in the same context the process would need to be able to change user id - this could easily become a security hole.

Implementing paging in the same context means that you would need the ability to page out other processes' pages, which could also become a security hole.

Microkernels have a lot of functionality moved outside of the kernel, but the kernel still has to provide functionality to support the proper behavior of, for example, the exec call. I think most of them have the loader(s) in a separate process.

Post by **NickJohnson** » Fri Mar 27, 2009 2:25 pm

Well, I don't mean that paging is handled at the *lowest* level by the library. The kernel still does page table manipulation. If a page fault occurs in userspace, control is passed to the kernel, then to a function in the library, which can request memory for a certain page from the kernel or do whatever else it wants. It can't mess with other processes' address spaces or even modify its own page tables directly. That's how policies like delayed loading of the process image could be separated from the kernel. You could even make a library try to stop segfaults at all costs by allocating memory automatically, but I don't know what you could use that for.
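To make the split between mechanism and policy concrete, here is a minimal sketch of that fault path, simulated in plain C. It is not code from the thread: `sys_map_page`, `lib_on_page_fault`, and `struct image_region` are hypothetical names, and the "kernel" is just an array that records which pages have been mapped. The library only decides whether a faulting address deserves a page; the mapping itself stays on the kernel side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define MAX_PAGES 64u

/* Hypothetical stand-in for the kernel: it owns the page tables and
 * exposes only "map a fresh page at this page index". */
static bool kernel_mapped[MAX_PAGES];
static unsigned kernel_map_calls;

static bool sys_map_page(size_t page_index)
{
    if (page_index >= MAX_PAGES || kernel_mapped[page_index])
        return false;               /* out of range or already mapped */
    kernel_mapped[page_index] = true;
    kernel_map_calls++;
    return true;
}

/* Library-side knowledge: where the process image is supposed to live. */
struct image_region {
    uintptr_t base;    /* page-aligned start address */
    size_t    length;  /* size in bytes */
};

enum fault_result { FAULT_RESOLVED, FAULT_SEGV };

/* Policy: faults inside the image region get a page on demand,
 * anything else is treated as a real segmentation fault. */
static enum fault_result lib_on_page_fault(const struct image_region *img,
                                           uintptr_t addr)
{
    assert(img != NULL);
    if (addr < img->base || addr - img->base >= img->length)
        return FAULT_SEGV;
    return sys_map_page(addr / PAGE_SIZE) ? FAULT_RESOLVED : FAULT_SEGV;
}
```

A different library could swap in another policy (for example, mapping a page for every fault) without any change to the kernel-side stub, which is the separation being argued for here.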

But I hadn't thought about suid executables. I guess I would need some sort of system call to change the user ID, but one that is secure. I could restrict the user ID of a process to becoming one of a lower privilege level (I guess a higher uid), but that removes sudo and su. That is, unless you keep a root user process running at all times (like init) that would spawn a higher privilege level process when prompted, and handle security itself.
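The "only ever lower your privilege" rule can be sketched in a few lines. This is an illustration under the ordering assumed in this post (uid 0 is root and a numerically higher uid is less privileged), which is not how POSIX defines privilege; `sys_drop_uid`, `struct process`, and `uid_type` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int uid_type;   /* hypothetical; avoids clashing with uid_t */

struct process {
    uid_type uid;
};

/* Applies the change only if it does not raise privilege.
 * Assumption: a numerically higher uid means lower privilege. */
static bool sys_drop_uid(struct process *p, uid_type new_uid)
{
    assert(p != NULL);
    if (new_uid < p->uid)
        return false;            /* would gain privilege: refuse */
    p->uid = new_uid;
    return true;
}
```

With a call like this, a process at uid 1000 can move to uid 2000 but never back to 0, which is exactly why sudo and su stop working without a privileged helper process.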

Maybe a bit more background on what my design is would be helpful. It's sort of a mix of different design paradigms, so I'm reluctant to call it anything specific. The kernel is very simple: it manages only tasking and paging (there is no swap). It doesn't even have an allocation heap, and resides in just a small sliver of the address space. All I/O drivers are user processes (except maybe for the initrd), which do their own caching. IPC is done with a synchronous message system, which causes interrupt-like function calls in the receiving processes. Messages are only two words long, so large amounts of data are transferred through a simple shared memory system. Various messages (like page faults) are also passed into userspace by the kernel, to remove as much policy as possible. The main C library handles the generally esoteric ABI and makes it compatible with other APIs (like POSIX). It also handles execution, high level mm, and anything else that can be moved out of the kernel.
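As a rough illustration of the messaging scheme, the sketch below models a two-word message that triggers a handler in the receiver, with the bulk data travelling through a shared buffer. Everything here is an assumption made for the example: `struct message`, `send_message`, `send_bulk`, and the use of the two words as an offset and a length are not taken from the actual kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A message is exactly two machine words. */
struct message {
    uintptr_t w0;
    uintptr_t w1;
};

typedef void (*msg_handler)(struct message m);

struct task {
    msg_handler handler;   /* invoked synchronously, interrupt-style */
};

/* Synchronous send: the receiver's handler runs before send returns. */
static void send_message(struct task *to, struct message m)
{
    assert(to != NULL && to->handler != NULL);
    to->handler(m);
}

static char shared_buf[256];   /* stands in for a shared memory region */
static char received[256];     /* receiver's private copy */

/* Receiver: w0 is an offset into the shared region, w1 is a length. */
static void receiver_handler(struct message m)
{
    size_t off = m.w0, len = m.w1;
    if (off > sizeof shared_buf || len > sizeof shared_buf - off)
        return;                          /* reject out-of-range requests */
    if (len >= sizeof received)
        return;                          /* keep room for the terminator */
    memcpy(received, shared_buf + off, len);
    received[len] = '\0';
}

/* Sender: place the payload in shared memory, then send two words. */
static bool send_bulk(struct task *to, const char *data, size_t len)
{
    if (len > sizeof shared_buf)
        return false;
    memcpy(shared_buf, data, len);
    send_message(to, (struct message){ .w0 = 0, .w1 = len });
    return true;
}
```

The message itself never grows; only the meaning of its two words changes per message type, which keeps the kernel's copy path fixed in size.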

Post by **skyking** » Fri Mar 27, 2009 3:47 pm

nickbjohnson4224 wrote: Well, I don't mean that paging is handled at the *lowest* level by the library. The kernel still does page table manipulation. If a page fault occurs in userspace, control is passed to the kernel, then to a function in the library, which can request memory for a certain page from the kernel or do whatever else it wants. It can't mess with other processes' address spaces or even modify its own page tables directly.

The problem arises when physical memory constraints demand that another page be swapped out. Unless you're allowed to manipulate other processes' memory maps, directly or indirectly, the only choice would be to kick out one of your own pages, which could be ineffective.

nickbjohnson4224 wrote: But I hadn't thought about suid executables. I guess I would need some sort of system call to change the user ID, but one that is secure. I could restrict the user ID of a process to becoming one of a lower privilege level (I guess a higher uid), but that removes sudo and su. That is, unless you keep a root user process running at all times (like init) that would spawn a higher privilege level process when prompted, and handle security itself.

But exec, when given a suid root program as its argument, should result in that program running as root in a process with the same pid as the caller. If init has spawned this process, there must be a mechanism that replaces the calling process with the newly spawned one (that is, modify the pid, or spawn into an existing pid), and then you have effectively moved the exec implementation to another process (and there's no need to do this in the same context anymore).

Post by **NickJohnson** » Fri Mar 27, 2009 4:37 pm

skyking wrote: The problem arises when physical memory constraints demand that another page be swapped out. Unless you're allowed to manipulate other processes' memory maps, directly or indirectly, the only choice would be to kick out one of your own pages, which could be ineffective.

I'm not sure what the problem really is. I wasn't going to implement swapping until much later, but I could just make the kernel handle page faults pointing to pages that are swapped out. The kernel could then force swapouts if necessary and have them reload dynamically. The kernel still has ultimate power; I'm just moving as much as possible out of it.

I do see how it would essentially turn init into a simple process manager to make the system run suid things this way. However, even if that happens, it could be valuable to have the C library do execution. Doing it in a single context would be faster than a call to another process, and be more flexible. Either way it moves the execution out of the kernel: the two ideas are not mutually exclusive.

And who says SUID is the best way to do things? Maybe there is an alternative design that would work with library execution.

Post by **skyking** » Fri Mar 27, 2009 4:55 pm

nickbjohnson4224 wrote: And who says SUID is the best way to do things? Maybe there is an alternative design that would work with library execution.

Well, the point of the suid example is that the extended permissions are granted only when specific exec images are executed. The loading of the image and the extension of the permissions must be inseparable. If the system has a mechanism to do these separately, a process could use it to execute code other than the code that was allowed to run with the extended permissions (any process could become root whenever it wanted, for any purpose).

Also, POSIX has the twist that files can be execute-only (i.e. unreadable). How could you load the image if you're not allowed to read it? The exec call should not be hindered by read protection.

Post by **NickJohnson** » Fri Mar 27, 2009 6:30 pm

That's a really good point. I hate security, but we can't have script kiddies taking down the whole system. However, I just came up with a good solution for both problems that takes advantage of the existing design.

I could embed a read-only (tamper-proof) tag in a specified page in the process that identifies its library. Every executable would also bear a tag, corresponding to the tag of the library it should be executed with. The library would use a special type of read call (just another type of message) to ask an I/O driver for an executable file, and provide shared memory access to its tag. If the driver gets a match between the two tags, it okays the read call. If not, it signals the kernel to replace the library in that address space (which the kernel must already be capable of for booting), then sends a type of resume-exec signal to it. The replace-library call must be restricted to driver processes, but I was thinking of doing that anyway. The new library would be trusted by the executable, and there would be no security issues. Library images would need to be installed by an administrator, and be indexed by the kernel. Because many files would request the same (or compatible) library as the rest of the system, there would be few of these switches, but it would still leave room for custom libraries as well as non-library-based programs.
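The driver-side decision could look roughly like the following sketch. It only models the control flow described above; `lib_tag`, `driver_exec_read`, `kernel_replace_library`, and the `resume_exec_pending` flag are hypothetical names, and tags are assumed to be plain integers compared for equality (the "compatible library" case is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t lib_tag;   /* assumption: a tag is a plain integer */

struct address_space {
    lib_tag library_tag;         /* read-only tag page, as seen by the driver */
    bool    resume_exec_pending; /* set when a resume-exec signal is queued */
};

struct executable {
    lib_tag required_tag;        /* tag of the library it must run with */
};

enum read_result { READ_OK, READ_LIBRARY_REPLACED };

/* Kernel-side operation, assumed to be callable by driver processes only. */
static void kernel_replace_library(struct address_space *as, lib_tag new_tag)
{
    as->library_tag = new_tag;        /* load the matching library image */
    as->resume_exec_pending = true;   /* new library continues the exec */
}

/* Driver-side check performed on the special "read executable" call. */
static enum read_result driver_exec_read(struct address_space *caller,
                                         const struct executable *exe)
{
    assert(caller != NULL && exe != NULL);
    if (caller->library_tag == exe->required_tag)
        return READ_OK;               /* tags match: allow the read */
    kernel_replace_library(caller, exe->required_tag);
    return READ_LIBRARY_REPLACED;
}
```

In this model the calling library never sees the file contents on a mismatch; the read only succeeds once the library trusted by the executable is the one running in that address space.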

I will have my cake and eat it too!

OSDev.org
