To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 4:17 am
by glauxosdever
Hi,
After this recent thread, I started questioning what I've been saying for about the last 2.5 years. Maybe I should do a U-turn after all and do everything I've argued against since then? Maybe a UNIX-like OS, with some software written from scratch instead of being ported over? I don't know.
For me, "success" means for the OS to either have some curious users (not necessarily full time ones) or influence how other OSes do stuff, while having some technical merits. Both of the ORed "prerequisites" can be accomplished with both POSIX and a fully custom design (maybe even more easily when doing POSIX). But having technical merits can be harder when you have to adhere to POSIX.
However, now that I think of it, one could still improve on the Linux ecosystem while being POSIX (let's ignore the BSDs for now). Linux is a big monolithic kernel; I think a microkernel would be less messy and would allow drivers to be installed more easily. Most Linux distributions use glibc as the standard library, and I've heard many times and from many sources that it's unnecessarily bloated. Missing asynchronous calls could be implemented alongside POSIX (e.g. async open), thus allowing software to do work in the same thread while waiting for the OS to complete the request. Security can still be enhanced beyond the usual 9 permission bits for files. Or the desktop environment could be much more lightweight (ironically, the Pluma text editor uses 40 MiB of RAM when just an empty unsaved file is ready to be edited). Or other stuff. But the problem I see is that most ported software wouldn't be designed for it.
What do you think?
Regards,
glauxosdever
Re: To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 7:40 am
by bzt
Hi,
Yes, glibc is bloated big time. But it's not the default libc under Linux anymore. Many have already switched, and every day a new distro switches to musl.
POSIX is a dying dinosaur itself; it was not designed, it evolved from many different OSes, which means it has lots of controversy and legacy stuff you won't need anymore. The benefit of sticking to it is that it makes porting existing software easier, but that's all.
Not using POSIX opens up many new opportunities, but designing an OS interface is not easy, requires lots of experience (and more). The advantage here is you're free to do whatever you want, although the basic scheme would remain the same: file operations, networking functions etc. You can have new functions sure, little alterations too, but basically you'll need a file open function with a path argument, and a read and write with an opened file context and buffer address. There's not much to change about those. You won't invent a better quick sort or binary search algorithms either. It is more like what you want to include from the existing stuff in your user space library and how you organise it.
So the question is more like what would you want to do with your OS? Make existing software available? Use POSIX. Experiment with new possibilities? Do not use POSIX.
I for example use the middle path approach: I create a totally new interface, without any legacy or backward compatibility issues; but at the same time I try to keep the POSIX interface as much as possible. For example: I have strlen(), strnlen() and mbstrlen() in my string.h just like in POSIX, but I do not have mblen() in stdlib.h; instead I have mbstrnlen(), also in string.h, for function name and function grouping consistency. (Seriously, what lunatic thought it would be fun to put mblen() in stdlib.h instead of string.h, and why did the POSIX group accept that at all? How come nobody noticed it's a string function with the same arguments as strnlen()? Was everybody high or what? This simple example tells a lot about POSIX.)
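To make the mbstrnlen() idea concrete, here is a minimal sketch of how such a function could be layered on the standard mblen(). The name mbstrnlen_sketch and its exact semantics (count multibyte characters within at most n bytes, stop at the terminator or at an invalid sequence) are assumptions for illustration; they are not taken from bzt's actual library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: number of multibyte characters found in the first
 * n bytes of s, using the current locale's encoding. Counting stops at the
 * NUL terminator or at the first invalid or incomplete sequence. */
size_t mbstrnlen_sketch(const char *s, size_t n)
{
    assert(s != NULL);

    size_t chars = 0;   /* characters counted so far */
    size_t used = 0;    /* bytes consumed so far */

    (void)mblen(NULL, 0);   /* reset mblen()'s internal shift state */

    while (used < n) {
        int len = mblen(s + used, n - used);
        if (len <= 0)       /* 0: terminator reached, -1: invalid sequence */
            break;
        used += (size_t)len;
        chars++;
    }
    return chars;
}
```

The result depends on the locale selected with setlocale(); in the default "C" locale plain ASCII text counts one character per byte.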
Cheers,
bzt
Re: To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 9:06 am
by glauxosdever
Hi,
bzt wrote:POSIX is a dying dinosaur itself; it was not designed, it evolved from many different OSes, which means it has lots of controversy and legacy stuff you won't need anymore. The benefit of sticking to it is that it makes porting existing software easier, but that's all.
I agree that POSIX isn't that good (otherwise I wouldn't have been talking against it for 2.5 years), but I'll probably disagree that it's a dying dinosaur (the "dying" part, not the "dinosaur" one). Most server systems are running some kind of UNIX-like OS (usually Linux), so it's far from dying. And, to be honest, I'm not seeing it going away anytime soon.
Not using POSIX opens up many new opportunities, but designing an OS interface is not easy, requires lots of experience (and more). The advantage here is you're free to do whatever you want, although the basic scheme would remain the same: file operations, networking functions etc. You can have new functions sure, little alterations too, but basically you'll need a file open function with a path argument, and a read and write with an opened file context and buffer address. There's not much to change about those. You won't invent a better quick sort or binary search algorithms either. It is more like what you want to include from the existing stuff in your user space library and how you organise it.
So the question is more like what would you want to do with your OS? Make existing software available? Use POSIX. Experiment with new possibilities? Do not use POSIX.
I'm aware of the implications of using POSIX or not, and what these implications include or not. Apart from organisation, it's also semantics, and some POSIX semantics are quite bad too.
I for example use the middle path approach: I create a totally new interface, without any legacy or backward compatibility issues; but at the same time I try to keep the POSIX interface as much as possible. For example: I have strlen(), strnlen() and mbstrlen() in my string.h just like in POSIX, but I do not have mblen() in stdlib.h; instead I have mbstrnlen(), also in string.h, for function name and function grouping consistency. (Seriously, what lunatic thought it would be fun to put mblen() in stdlib.h instead of string.h, and why did the POSIX group accept that at all? How come nobody noticed it's a string function with the same arguments as strnlen()? Was everybody high or what? This simple example tells a lot about POSIX.)
That middle ground approach might be OK if the stuff you don't implement is anyway bad or not used, or simply unimplemented because the entire set of this stuff is unimplemented. Otherwise, I'd be wary of it.
As for mblen(), yeah, it's bad organisation. There are also mbstowcs() and mbtowc() in stdlib.h, along with other functions that "probably have no better place". I however think it's worse that strcpy(), strcat(), wcscpy() and their siblings are included, while strlcpy() still hasn't made it (and thus GNU doesn't bother to implement it). Another weird thing is that the standard way to spawn a process is to clone the entire address space just to overwrite it (fortunately copy-on-write makes it tolerable). Or that the access time of a file has to be updated each time it's read. Or that reading 4096 bytes may return with 1234 bytes and call it a success. Or that "recursive" can be denoted with different option letters, e.g. "rm -r", "ls -R", "mkdir -p". But these are probably minor problems.
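The short-read point is easy to show in code. Because read() may legitimately return fewer bytes than requested, portable callers end up wrapping it in a loop. A minimal sketch follows; the helper name read_full is an assumption, not a POSIX function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until count bytes have arrived,
 * end of file is reached, or a real error occurs. Returns the number of
 * bytes actually read, or -1 on error (errno is left as set by read()). */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL);

    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n > 0) {
            done += (size_t)n;      /* short read: ask again for the rest */
        } else if (n == 0) {
            break;                  /* end of file */
        } else if (errno == EINTR) {
            continue;               /* interrupted by a signal, retry */
        } else {
            return -1;              /* genuine error */
        }
    }
    return (ssize_t)done;
}
```

A native API that either fills the buffer or reports why it could not would make this wrapper unnecessary, which is the kind of semantic change being argued for here.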
Regards,
glauxosdever
Re: To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 10:40 am
by Korona
I am on mobile, so I'll just send some short remarks; I will gladly expand on my points if some questions arise.
For managarm, I went the "custom kernel and driver API but POSIX emulation in userspace" route. This has worked remarkably well for me. The internal API is fully asynchronous (the only blocking syscall is FutexWait). The POSIX API (including signals, mmap() and stuff like epoll) works entirely in userspace. Indeed, I went further than just emulating POSIX, I emulate a lot of Linux, for example the DRM API to change graphics modes and enough of sysfs to run eudev.
For a libc, you can always join our effort with mlibc (it's easier to make progress when more people contribute). It basically tries to implement the glibc API while being portable. So far, it has been ported to two OSes. I will not claim that it is anywhere near feature completion, but it can run gcc, binutils, coreutils, nano, eudev, mesa, cairo, Wayland, Weston, ...
Re: To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 11:01 am
by Octacone
Ditch UNIX and POSIX and everything similar (even GRUB). Go custom, that's the only right choice.
Going POSIX means going Linux aka you would just create another Linux compatible that wouldn't differ from the competition.
Going custom means endless possibility for creativity and improvements and not having to follow ancient outdated guidelines or standards.
For example, my kernel is monolithic, has a custom bootloader, is not UNIX like, doesn't have anything to do with POSIX, doesn't use any third party libraries, has (will have) a custom C library.
I also think that any current third party library is bloated with a ton of crap (POSIX included) that is meant for some other OS (aka Linux/UNIX from the 90s).
Following POSIX means subconsciously learning how Linux works and doing another Linux clone, why not spend your time contributing to the Linux source then, you would basically be doing the same?
Another thing I hate is the "standard" C library naming. I don't want my functions to be called SIGTTOU or SGSFDFMF; to someone who isn't used to coding on Linux it would be a pain to figure out what those names exactly mean. I prefer naming my functions so a total newbie could figure them out, e.g. PMM.Allocate_Block or VMM.Map_Virtual_To_Physical, so you know what it's for.
Re: To POSIX or not to POSIX
Posted: Mon Apr 08, 2019 1:54 pm
by Korona
IMO, the prescribed choice between creativity and implementing POSIX is a false dichotomy. I wrote enough about my own OS above. If we look at other OSes, we see that implementing POSIX really does not constrain diversity at all. Today, Windows is probably the OS that is closest to Linux: after all, it implements large chunks of the Linux API via WSL. Yet, Windows and Linux are entirely different. The same goes for Linux and MacOS (both implement POSIX but the latter is a microkernel). I cannot really comment on Linux vs. FreeBSD (which also has a Linux subsystem and some kernel ABI compatibility with Linux), but I do expect it to be substantially different. The design space is really huge here.
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 3:19 am
by bzt
Hi,
Octacone wrote:Going POSIX means going Linux aka you would just create another Linux compatible that wouldn't differ from the competition.
I agree. I would like to add not just a Linux compatible, but "you would just create yet another UNIX clone".
Korona wrote:IMO, the prescribed choice between creativity and implementing POSIX is a false dichotomy. I wrote enough about my own OS above. If we look at other OSes, we see that implementing POSIX really does not constrain diversity at all. Today, Windows is probably the OS that is closest to Linux: after all, it implements large chunks of the Linux API via WSL. Yet, Windows and Linux are entirely different. The same goes for Linux and MacOS (both implement POSIX but the latter is a microkernel). I cannot really comment on Linux vs. FreeBSD (which also has a Linux subsystem and some kernel ABI compatibility with Linux), but I do expect it to be substantially different. The design space is really huge here.
I'd like to disagree. Windows is not POSIX at all, it is using its own executable format, therefore it's not the closest. FreeBSD does not have a Linux subsystem btw. The executable format, the dynamic linker, the libc functions etc. all are the same, the only difference is different syscall numbering. So to execute a Linux binary under BSD, all you need is a different syscall lookup table, there's absolutely no need for another subsystem.
https://www.bsdcan.org/2018/schedule/tr ... 76.en.html
And I still think following POSIX blindly kills creativity. Yes, there's limited freedom in how you implement POSIX, but not in how you think about the whole OS abstraction.
For example, VMS (now called OpenVMS) implemented recursive paths, something unthinkable in POSIX (this means that recursion is not specified by "-r" / "-R" / "-p" etc. flags, but by the special "..." directory, and as this is handled by the FS layer, directory recursion is available to all user space applications at once). Its non-shell-compatible scripting language, DCL, supports string subtraction, which I have never seen in any UNIX (and I have to tell you it is extremely useful, like: datafile = exefile - ".exe" + ".dat"). Another example: next to IBM's JCL, the POSIX jobs ("&" and "kill %1" etc.) are just a bad joke. Or you could compare pthreads to Solaris' LWP, where the kernel knows about the threads and optimises for them in the scheduler, for example. And the list just goes on and on...
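For readers who have not used DCL, the datafile = exefile - ".exe" + ".dat" expression can be approximated in C as below. This is only a sketch: the function name str_minus_plus is made up, and it assumes that subtraction removes just the first occurrence of the substring and leaves the string unchanged when there is no match.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string equal to
 * (s - minus) + plus, i.e. s with the first occurrence of minus removed
 * and plus appended. The caller frees the result. Returns NULL if the
 * allocation fails. */
char *str_minus_plus(const char *s, const char *minus, const char *plus)
{
    assert(s != NULL && minus != NULL && plus != NULL);

    /* Locate the part to remove; an empty pattern removes nothing. */
    const char *hit = (*minus != '\0') ? strstr(s, minus) : NULL;
    size_t head_len = hit ? (size_t)(hit - s) : strlen(s);
    const char *tail = hit ? hit + strlen(minus) : "";
    size_t tail_len = strlen(tail);
    size_t plus_len = strlen(plus);

    char *out = malloc(head_len + tail_len + plus_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, s, head_len);                            /* text before the match */
    memcpy(out + head_len, tail, tail_len);              /* text after the match */
    memcpy(out + head_len + tail_len, plus, plus_len + 1); /* suffix plus NUL */
    return out;
}
```

Calling str_minus_plus("prog.exe", ".exe", ".dat") yields "prog.dat", which is the whole DCL one-liner spread over a dozen lines of C.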
Cheers,
bzt
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 7:31 am
by glauxosdever
Hi,
Korona wrote:For managarm, I went the "custom kernel and driver API but POSIX emulation in userspace" route. This has worked remarkably well for me. The internal API is fully asynchronous (the only blocking syscall is FutexWait). The POSIX API (including signals, mmap() and stuff like epoll) works entirely in userspace. Indeed, I went further than just emulating POSIX, I emulate a lot of Linux, for example the DRM API to change graphics modes and enough of sysfs to run eudev.
That also works; but then the usual software that assumes POSIX won't use your asynchronous APIs (which are probably more efficient than synchronous POSIX ones). My point is, one of the reasons for me not to do a POSIX system is that software would be designed specifically around my OS API and would be better optimised for it. Also, I would like to avoid having the need to port not-the-best quality software like GTK or gcc because users naturally expect it (where "naturally" is meant as a "natural progression from 'just POSIX' to 'the full ecosystem' "). But, given that you emulate a lot of Linux, I can conclude it's one of your goals, and it's totally fine.
Octacone wrote:Ditch UNIX and POSIX and everything similar (even GRUB). Go custom, that's the only right choice.
Going POSIX means going Linux aka you would just create another Linux compatible that wouldn't differ from the competition.
Going custom means endless possibility for creativity and improvements and not having to follow ancient outdated guidelines or standards.
For example, my kernel is monolithic, has a custom bootloader, is not UNIX like, doesn't have anything to do with POSIX, doesn't use any third party libraries, has (will have) a custom C library.
I also think that any current third party library is bloated with a ton of crap (POSIX included) that is meant for some other OS (aka Linux/UNIX from the 90s).
Following POSIX means subconsciously learning how Linux works and doing another Linux clone, why not spend your time contributing to the Linux source then, you would basically be doing the same?
Another thing I hate is the "standard" C library naming. I don't want my functions to be called SIGTTOU or SGSFDFMF; to someone who isn't used to coding on Linux it would be a pain to figure out what those names exactly mean. I prefer naming my functions so a total newbie could figure them out, e.g. PMM.Allocate_Block or VMM.Map_Virtual_To_Physical, so you know what it's for.
I wouldn't say not doing POSIX is the only right choice. Software compatibility may be sometimes desired more than technical merits (the latter of which can still exist in a POSIX system to some degree, e.g. less memory consumption or some additional security features). But indeed, POSIX dictates some stuff, and that's why bzt speaks of "limited freedom".
So, I'm ditching the idea of doing a UNIX-like system (again). But then maybe a lot of standard C stuff can go because it isn't that good anyway. Or even C itself can go away; it's a good opportunity to do that (after all, if C were redesigned today from scratch, it would be much different from the latest C standard). Where would you draw the line?
Regards,
glauxosdever
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 9:00 am
by glauxosdever
Hi,
Actually, I'm still undecided. Maybe I could do a POSIX system after all due to having a limited amount of time (no time for experimenting with APIs, languages, compilers and build systems). And I would actually like to do some OS development already, even if the end result is somewhat bad. But then again, I don't know.
Regards,
glauxosdever
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 9:52 am
by Korona
bzt wrote:I'd like to disagree. Windows is not POSIX at all, it is using its own executable format, therefore it's not the closest.
That's exactly my point: Windows and Linux are completely different, yet Windows implements most of the Linux API (branded as Windows Subsystem for Linux, WSL) and is only rivaled by FreeBSD (and maybe managarm) in this regard.
bzt wrote:FreeBSD does not have a Linux subsystem btw. The executable format, the dynamic linker, the libc functions etc. all are the same, the only difference is different syscall numbering. So to execute a Linux binary under BSD, all you need is a different syscall lookup table, there's absolutely no need for another subsystem.
https://www.bsdcan.org/2018/schedule/tr ... 76.en.html
For FreeBSD, I was referring to linuxkpi, which is a FreeBSD kernel subsystem that mimics Linux' in-kernel ABI and allows Linux drivers (in particular, DRM drivers) to run in the FreeBSD kernel.
bzt wrote:And I still think following POSIX blindly kills creativity. Yes, there's limited freedom in how you implement POSIX, but not in how you think about the whole OS abstraction.
I agree with bzt that following POSIX blindly is a horrible idea. But that is true for any existing standard. I do think that your examples about VMS etc. make a lot of sense and show that POSIX is not the pinnacle of good design.
glauxosdever wrote:That also works; but then the usual software that assumes POSIX won't use your asynchronous APIs (which are probably more efficient than synchronous POSIX ones). My point is, one of the reasons for me not to do a POSIX system is that software would be designed specifically around my OS API and would be better optimised for it. Also, I would like to avoid having the need to port not-the-best quality software like GTK or gcc because users naturally expect it (where "naturally" is meant as a "natural progression from 'just POSIX' to 'the full ecosystem' "). But, given that you emulate a lot of Linux, I can conclude it's one of your goals, and it's totally fine.
Implementing a POSIX subsystem does not mean that I can never write native apps -- it just means I can also run existing apps.
It's as simple as that.
The time required to support POSIX is not huge compared to the time required to build a stable core system. In managarm, the POSIX emulation is barely 10k SloC, compared to 30k in the kernel and drivers.
glauxosdever wrote:Actually, I'm still undecided. Maybe I could do a POSIX system after all due to having a limited amount of time (no time for experimenting with APIs, languages, compilers and build systems). And I would actually like to do some OS development already, even if the end result is somewhat bad. But then again, I don't know.
Producing something is always better than producing nothing (regardless of whether that means producing a POSIX-compatible or a POSIX-incompatible OS). I really do believe that the perfect is the enemy of the good.
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 11:06 am
by linguofreak
Korona wrote:I am on mobile, so I'll just send some short remarks; I will gladly expand on my points if some questions arise.
For managarm, I went the "custom kernel and driver API but POSIX emulation in userspace" route.
But is implementing POSIX in userspace really *emulating* it? AFAICT, POSIX pretty much entirely bears on userspace. POSIX gives a list of functions that need to be present in a conforming implementation, but it does not, AFAIK, specify that said functions need to correspond 1:1 with calls into the kernel.
Of course, some POSIX syscalls may be easy to implement as a userspace function that calls into a differently-structured kernel API, while others may require a lot more effort. The primary example I can think of is fork(), which apparently the Cygwin team found very difficult to implement with Win32 process spawning semantics (of course, now it's implemented natively in WSL). So if you're planning on having a POSIX layer at any level, you should probably have a kernel syscall that implements fork() directly. But then, fork() is one of the things that I think that Unix does right, and that even a completely unposixy kernel with no plans for even a userspace POSIX layer should have.
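For reference, this is the classic pattern any fork() implementation has to support: duplicate the process, replace the child's image, and collect its exit status in the parent. It is a minimal sketch; the helper name spawn_and_wait is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and return its
 * exit status, or -1 if the child could not be created, could not be
 * waited for, or did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();             /* child starts as a copy of the parent */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);      /* replace the copied image */
        _exit(127);                 /* only reached if execvp() failed */
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)         /* retry only when interrupted */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A kernel whose native primitive is "create a new process from this executable" can serve the common fork-then-exec case through posix_spawn() instead, but programs that call fork() without exec still need real address-space duplication.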
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 11:16 am
by linguofreak
Korona wrote:
Implementing a POSIX subsystem does not mean that I can never write native apps -- it just means I can also run existing apps.
It's as simple as that.
I think the point is that a POSIX layer will discourage 3rd party developers from developing for the native API rather than just using the POSIX layer.
OTOH, if your name isn't Linus, the response to the above sentence with regards to your hobby OS is "what 3rd party developers?".
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 12:04 pm
by nullplan
linguofreak wrote:But is implementing POSIX in userspace really *emulating* it? AFAICT, POSIX pretty much entirely bears on userspace. POSIX gives a list of functions that need to be present in a conforming implementation, but it does not, AFAIK, specify that said functions need to correspond 1:1 with calls into the kernel.
POSIX also prescribes concepts and the semantics of these functions. The effect of pthread_create(), for instance, must be to create a concurrent thread. POSIX will tell you what a thread is, and what a process is, and your OS needs to have something approximating these concepts. As evidenced by Linux, even small deviations can mean a major complication for userspace in supporting POSIX semantics.
For example, POSIX defines credentials (UID, GID, etc.) to be set for the process, whereas in Linux they are thread-local. Which means that all the libcs have to implement setuid() etc. such that in the multi-threaded case, it sends a signal to all other existing threads, which they handle by calling the setuid() system call themselves. It's a major pain. Worse, due to race conditions, musl for instance resorts to actually iterating over /proc/self/task in order to find all threads. It's not nice code. It bails out if /proc/self/task can't be opened, which apparently happens under load.
And all just because Linux makes something thread-local that was supposed to be process-global.
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 12:23 pm
by bzt
Hi,
I think you guys are saying the same thing.
Korona's WSL example and linguofreak's Cygwin example are both good examples showing that a POSIX emulation layer can be done in userspace even for a non-POSIXy kernel (please don't mind me writing emulation here, as I have no better word for it. That's clearly not an emulation per se under FreeBSD, but could be one under Windows, depending on point of view). Also, the fork() example and nullplan's pthread example shine light on the fact that it is easier to port a POSIX libc if the kernel interface was written with POSIX in mind, but it's not an impossible task if it wasn't, just more difficult.
I think this is a good way. For example under MacOSX, you could install Gimp from homebrew, using the POSIX and X11 interface, but later its source was expanded and recompiled for the native OSX API, thus creating a dmg installer. I see no reason why these steps could not be done for any application for any hobby OS.
So if you don't mind, I'd like to draw the conclusion here, let's hope the OP agrees: a hobby OS should not tie itself to POSIX, instead it should experiment with new features; and if the need to port existing software becomes apparent, then as a first step the hobby OS should provide a user space POSIX library which would hide the OS specific stuff. This also means that ported POSIX software will not be able to use the OS' specific features (like async calls or LWP) unless it is rewritten as a native application, but imho that's totally acceptable.
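To illustrate the layering, here is a toy sketch of a blocking, POSIX-style read built in user space on top of an asynchronous native interface. Everything in it is hypothetical: native_read_submit(), native_wait() and compat_read() are invented names, and the "kernel" is simulated with an in-memory string so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical native request: submitted first, completed later. */
struct native_request {
    const char *src;    /* stand-in for the file contents */
    void *dst;          /* caller's buffer */
    size_t len;         /* bytes requested */
    size_t result;      /* bytes transferred once done */
    int done;           /* nonzero after completion */
};

/* Hypothetical native call: queue the read and return immediately. */
void native_read_submit(struct native_request *req, const char *src,
                        void *dst, size_t len)
{
    req->src = src;
    req->dst = dst;
    req->len = len;
    req->result = 0;
    req->done = 0;
}

/* Hypothetical native call: wait for completion (simulated by doing the
 * copy here) and report how many bytes were transferred. */
size_t native_wait(struct native_request *req)
{
    if (!req->done) {
        size_t avail = strlen(req->src);
        req->result = avail < req->len ? avail : req->len;
        memcpy(req->dst, req->src, req->result);
        req->done = 1;
    }
    return req->result;
}

/* Compatibility layer: the blocking call ported software expects is just
 * submit followed by wait. */
size_t compat_read(const char *src, void *buf, size_t len)
{
    assert(src != NULL && buf != NULL);

    struct native_request req;
    native_read_submit(&req, src, buf, len);
    return native_wait(&req);
}
```

A native application would call native_read_submit(), do other work, and only then call native_wait(); ported software going through compat_read() never gets that overlap, which is the trade-off described above.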
Cheers,
bzt
ps.: for my kernel one of the biggest challenges is POSIX groups, as I don't have them. I use access control lists, which could be considered as many group memberships, each with its own rwx access bits. No way I could handle that with getgid() / setgid().
Re: To POSIX or not to POSIX
Posted: Tue Apr 09, 2019 12:33 pm
by Korona
That's indeed a good summary! I agree completely.