Reinventing Unix is not my problem

bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Reinventing Unix is not my problem

Post by bzt »

rdos wrote:I think that is a pretty poor argument. The problem is that if you are not exceptionally compatible with Linux or Windows
You misunderstood. I wasn't talking about binary compatibility, I was talking about source compatibility. One reason why SkyOS was discontinued is that Robert got fed up rewriting each new Firefox release to his interface.
rdos wrote:you will not be able to compile a majority of these even if you have a full libc with Posix. Most of these projects are full of ifdefs and typically require autoconf to even generate makefiles.
It depends on how much of the standard you support. For an application that is strictly POSIX (at the source level), not many ifdefs are needed (as with porting between Linux and the BSDs, where many applications need no ifdefs at all). On the other hand, an application that has to compile against both the POSIX and the WIN32 API needs a hell of a lot of ifdefs. So what you say is true, but it does matter how much work the porting needs.
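To make that concrete, here's a minimal sketch of the kind of shim a dual POSIX/WIN32 target needs even for something as trivial as sleeping a few milliseconds (the sleep_ms wrapper is made up for illustration; Sleep and nanosleep are the real APIs):
Code: Select all
#if !defined(_WIN32)
#define _POSIX_C_SOURCE 199309L   /* expose nanosleep() */
#endif
#include <stdio.h>

/* Illustrative portability shim: real dual-target projects end up
 * with dozens of these blocks. */
#ifdef _WIN32
#include <windows.h>
static void sleep_ms(unsigned ms) { Sleep(ms); }
#else
#include <time.h>
static void sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}
#endif

int main(void)
{
    puts("waiting...");
    sleep_ms(500);
    puts("done");
    return 0;
}
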
rdos wrote:For the projects I've ported, like JPG, PNG and SSL, I've added them to my own SVN repository and then only update them quite seldom.
And have you updated your SSL to the latest? There's a huge security hole in OpenSSL, reported a few days ago. Porting once isn't enough, you also must keep them up-to-date, ideally with automated patching.
nullplan wrote:Isn't MacOS running atop a kernel called XNU, which is very Unix-like?
It's the XNU kernel (built around Mach), and it is so POSIX compatible that macOS actually ships the FreeBSD userland (all the basic applications, like bash, awk, tar, cut, tr, etc., came from FreeBSD unchanged).
zaval wrote:Given David Cutler's dislike to UNIX, that's hardly a more, than far fetched, wishful thinking.
No, that's the truth. The NT kernel doesn't have concepts different from UNIX's. Nothing at all. It operates with shared libraries (called DLLs in NT parlance), files, a directory hierarchy, etc. With NTFS they even copied symlinks (called reparse points) and many other things. Processes use separate address spaces and use syscalls to access the kernel. Different implementation, different API, but exactly the same concepts.
rdos wrote:Kind of, and I have to disagree with C being something good too. I don't use C anywhere (except for in some complicated device-drivers), as my view is that either you use assembler or C++. C is a kind of middle-of-the-road alternative that is poor at both low-level stuff and object-oriented programming, and so has no place in my design
Think of C as a "portable Assembly", and you'll see right away how brilliant it is. Assembly isn't portable, and C++ is a terrible monster that 99% of programmers don't know how to use correctly. Are you familiar with Ted Ts'o's quote "handing C++ to the average programmer seems roughly comparable to handing a loaded .45 to a chimpanzee"? (You can use any language you want of course, but it might be worth thinking about why all C++ kernels have failed so far. Most notably GNU/Hurd (a kernel written in C++) was replaced by Linux (a kernel written in C) as soon as the latter got mature enough, for a good reason. Haiku, which is C++ all over, doesn't use any C++ features in its kernel, just pure C; C++ is limited to the user interface API only. Take a look at its source, for example elf.cpp: no namespaces, no classes, no "new" keyword, it's all pure C. C++ is only used in the parts which userspace can access, but not below.)
rdos wrote:In short, Unix is a design of the 70s, and as such is pretty poor in the 2020s when the hardware is quite different.
This is completely wrong, for several reasons. You're confusing implementation and concept.
a) new hardware does not introduce new concepts at all (maybe memristor will one day, but a new peripheral won't for sure)
b) all OSes in 2020 use exactly the same concepts as UNIX did in the '70s.

Cheers,
bzt
vvaltchev
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

nullplan wrote:Isn't MacOS running atop a kernel called XNU, which is very Unix-like?
Yes, it is. https://en.wikipedia.org/wiki/XNU
nullplan wrote:
vvaltchev wrote:Given how much Microsoft is investing in Linux, I'd guess it's even possible that one day the NT kernel will be replaced by a sort of Microsoft-patched Linux kernel.
I've heard that before, and... no. No, I don't think that will ever happen. First of all, who cares about the kernel underlying the OS? Hackers. Not normal people. For such a kernel replacement there are only two ways: Make it so that people notice or make it so that people don't notice. If people notice, it will likely be because something isn't working, which is not exactly profit-inducing. If people don't notice, that means MS spent tons of effort porting everything and creating compatibility layers, so that existing applications run. But why would they spend such effort on something that is designed not to be noticed? MS is a for-profit company, they only spend money if it can make them money in the long run. And this kernel swap that people envision could only cost them money.

Plus, NT has features Linux doesn't. It is my understanding that the main NT kernel is a hypervisor, and the various subsystems run as guest operating systems. That is why it was possible to just make a Linux kernel run as yet another subsystem. NT, as it is now, is working fine for what MS is doing with it. There is little impetus to change it.
Well, the theory is that Microsoft is just a single (even if big) company that has to maintain & develop, alone, a huge project like the NT kernel, and that costs a "ton" of money every year. Given that a considerable part of "desktop" users are moving away to other platforms (phones, tablets or other operating systems), and the trend looks set to continue, Microsoft will earn less and less from the Windows product as a whole. That's why the company is investing heavily in cloud services and other areas. Windows is slowly being de-funded, in my understanding. About 10 years ago or so, the company started to de-fund the Windows testing infrastructure, moving towards a "community testing" model with telemetry. Each release still gets internal testing (of course) before reaching beta testers etc., but not at the same level as in the past. So, the idea is that the company might at some point consider writing a compatibility layer and replacing its kernel with a kernel supported by a large community. Its support costs would drop a lot in the long term. Of course, that would require making everything *so good* that end users won't even notice. If they do notice, as you said, that would be a huge problem.

That said, I have to clarify that I'm not exactly hoping this will happen: it's just a theory that makes sense to me. In other words, I wouldn't be surprised if it happened. Actually, I'd prefer it not to happen, because competition is a good thing. If we end up with just a single kernel, its quality will decrease over time.

To also address @zaval's comments: whatever my personal preference for UNIX systems, I don't dislike NT or other operating systems at all; actually, I'm happy they exist. I use Windows myself from time to time, and in the past I used it every day for years without problems. I hope they find a way to make updates somewhat faster, but overall Windows 10 is a good OS. I was happy using MacOSX on my MacBook Pro as well. So I'm not an extremist "fanboy", guys. As I said in another post, while I like the "UNIX philosophy", I'm neither a "worse is better" guy nor an "MIT" guy: I'm something in between, focused most of the time on practical problems instead of being obsessed with ideals.
Last edited by vvaltchev on Thu Apr 01, 2021 7:36 am, edited 3 times in total.
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reinventing Unix is not my problem

Post by rdos »

bzt wrote:
rdos wrote:I think that is a pretty poor argument. The problem is that if you are not exceptionally compatible with Linux or Windows
You misunderstood. I wasn't talking about binary compatibility, I was talking about source compatibility. One reason why SkyOS was discontinued is that Robert got fed up rewriting each new Firefox release to his interface.
I talked about source compatibility, not binary.
bzt wrote:
rdos wrote:you will not be able to compile a majority of these even if you have a full libc with Posix. Most of these projects are full of ifdefs and typically require autoconf to even generate makefiles.
It depends on how much of the standard you support. For an application that is strictly POSIX (at the source level), not many ifdefs are needed (as with porting between Linux and the BSDs, where many applications need no ifdefs at all). On the other hand, an application that has to compile against both the POSIX and the WIN32 API needs a hell of a lot of ifdefs. So what you say is true, but it does matter how much work the porting needs.
Exactly.
bzt wrote:
rdos wrote:For the projects I've ported, like JPG, PNG and SSL, I've added them to my own SVN repository and then only update them quite seldom.
And have you updated your SSL to the latest? There's a huge security hole in OpenSSL, reported a few days ago. Porting once isn't enough, you also must keep them up-to-date, ideally with automated patching.
I did a Unix-compatible socket interface (which I generally dislike) just so I could get OpenSSL to build without too many modifications. Still, I never got the patches accepted by the project, and so I will need to copy the new source over my current & redo those patches again manually. OTOH, I don't use it for anything yet.
bzt wrote: No, that's the truth. NT kernel doesn't have different concepts than UNIX. Nothing at all. It operates with shared libraries (called DLLs in NT parlance), files, directory hierarchy, etc. With NTFS they have even copied symlinks (called reparse points) and many other things. Processes use separate address spaces and use syscalls to access the kernel. Different implementation, different API, but exactly the same concepts.
I wouldn't say these are the same concepts. For instance, CreateProcess in NT is much better than fork in Unix, and they are certainly not similar. DLLs are not like shared libraries in Unix, and NT doesn't even use the same executable formats (PE vs ELF). The use of syscalls with ints is a pretty inefficient method that they should NOT have borrowed from Unix. The same goes for ioctl.
bzt wrote:
rdos wrote:Kind of, and I have to disagree with C being something good too. I don't use C anywhere (except for in some complicated device-drivers), as my view is that either you use assembler or C++. C is a kind of middle-of-the-road alternative that is poor at both low-level stuff and object-oriented programming, and so has no place in my design
Think of C as a "portable Assembly", and you'll see right away how brilliant it is. Assembly isn't portable, and C++ is a terrible monster that 99% of programmers don't know how to use correctly. Are you familiar with Ted Ts'o's quote "handing C++ to the average programmer seems roughly comparable to handing a loaded .45 to a chimpanzee"? (You can use any language you want of course, but it might be worth thinking about why all C++ kernels have failed so far. Most notably GNU/Hurd (a kernel written in C++) was replaced by Linux (a kernel written in C) as soon as the latter got mature enough, for a good reason. Haiku, which is C++ all over, doesn't use any C++ features in its kernel, just pure C; C++ is limited to the user interface API only. Take a look at its source, for example elf.cpp: no namespaces, no classes, no "new" keyword, it's all pure C. C++ is only used in the parts which userspace can access, but not below.)
Kind of. I need to use C in the kernel since I never got around to porting C++, and it would not work well with my register-based APIs anyway. Actually, I will typically still implement the API part in assembly, and then provide C-callable functions for the complex stuff. Even though OpenWatcom supports segmented memory models & defining APIs using register interfaces, it's still rather burdensome, and it's much easier to do the interface part in assembly instead. For applications, I almost exclusively use C++; many C++ gurus would probably claim I use it the wrong way, but I don't care.
bzt wrote: a) new hardware does not introduce new concepts at all (maybe memristor will one day, but a new peripheral won't for sure)
They do. Things like memory-mapped IO don't work effectively with the file read/write API. They work best with physical addresses.
bzt wrote: b) all OS in 2020 use exactly the same concepts as UNIX did in the '70s.
I don't think so. :-)
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Reinventing Unix is not my problem

Post by bzt »

rdos wrote:I wouldn't say these are the same concepts. For instance, CreateProcess in NT is much better than fork in Unix, and they are certainly not similar.
Now you're talking about implementations and not concepts again. The concept is separate address spaces, filled with different segments (code and data); that's the same for both. As far as the implementation goes, CreateProcess can be emulated with fork+exec, but fork can't be emulated easily with CreateProcess, so it is questionable which one is better. In my opinion the one that can easily mimic the other is the better one. Plus, creating multiple processes connected with pipes is a hell of a lot easier with fork than with CreateProcess and its armada of arguments.
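For illustration only (a rough sketch, nobody's production code): this is about all it takes to build "ls | wc -l" with fork, exec and a pipe, while the CreateProcess equivalent means filling in STARTUPINFO, handle-inheritance flags and the rest of that armada.
Code: Select all
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Rough sketch of "ls | wc -l" built from fork, exec and a pipe. */
int main(void)
{
    int fd[2];
    if (pipe(fd) < 0) { perror("pipe"); return 1; }

    if (fork() == 0) {                  /* child 1: ls, stdout -> pipe */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]); close(fd[1]);
        execlp("ls", "ls", (char *)NULL);
        _exit(127);
    }
    if (fork() == 0) {                  /* child 2: wc -l, stdin <- pipe */
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]); close(fd[1]);
        execlp("wc", "wc", "-l", (char *)NULL);
        _exit(127);
    }
    close(fd[0]); close(fd[1]);         /* parent drops its pipe ends */
    while (wait(NULL) > 0)
        ;                               /* reap both children */
    return 0;
}
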
rdos wrote:DLLs are not like shared libraries in Unix
Yes, they are. Both are dynamically loaded, and both provide a function library shared among multiple processes. The concept (shared portions of code loaded dynamically as needed) is exactly the same.
rdos wrote:NT doesn't even use the same executable formats (PE vs ELF).
Again, implementation detail. The concept (that both file formats have a code segment and some data segments) is the same, so much, that objconv can convert between ELF and PE without probs. Actually the Linux kernel is capable of executing programs in the PE format with a module, proving that file format is indeed just a small implementation detail.
rdos wrote:The use of syscalls with ints is a pretty inefficient method that they should NOT have borrowed from Unix.
Again, implementation detail. One could have used the sysenter/syscall instructions. The point is, some functions require elevated privilege level, accessed by a special instruction and not via "standard" calls. The concept here is separated user space and kernel space (in contrast to Amiga Exec and Singularity for example, where there's only one address space, and kernel functions are accessed the same way as any other library function.)
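As a concrete (x86-64 Linux) sketch of how interchangeable the mechanism is: both calls below end up in the very same kernel service, one through the raw syscall instruction, one through the libc wrapper.
Code: Select all
#include <unistd.h>

/* write(1, buf, len) via the raw x86-64 Linux syscall ABI:
 * rax = syscall number (1 = write), args in rdi, rsi, rdx;
 * the syscall instruction clobbers rcx and r11. */
static long raw_write(const char *buf, unsigned long len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"(1L), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
}

int main(void)
{
    raw_write("via the syscall instruction\n", 28);
    write(1, "via the libc wrapper\n", 21);   /* same kernel service */
    return 0;
}
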
rdos wrote:The same goes for ioctl.
Same concept for both UNIX and WinNT. DeviceIoControl operates on opened file handles (just like ioctl in UNIX), and they provide very similar functionality.
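A small sketch of that shared concept, an out-of-band request on an already open handle (POSIX side shown; on NT the analogous call passes a control code plus in/out buffers to DeviceIoControl):
Code: Select all
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the terminal driver for its window size -- a request that doesn't
 * fit the plain read()/write() byte-stream model, hence ioctl().
 * DeviceIoControl(handle, code, in, inlen, out, outlen, &ret, NULL)
 * plays the same role on NT. */
int main(void)
{
    struct winsize ws;
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0)
        printf("%d rows x %d cols\n", ws.ws_row, ws.ws_col);
    else
        perror("ioctl");
    return 0;
}
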
rdos wrote:For applications, I almost exclusively use C++
I agree. C++ is good for user space applications and libraries, but not for a kernel IMHO. (But I must put the emphasis on "IMHO", I don't want to say nobody should ever use C++ for kernel development, all I'm saying is it's very hard and does not worth it IMHO.)
rdos wrote:They do. Things like memory-mapped IO don't work effectively with the file read/write API. They work best with physical addresses.
Again, you're confusing implementation with concept. Just take a look at fmemopen or mmap with fd -1, and you'll see that efficient implementations do exist.
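A minimal sketch of both: each hands back a perfectly ordinary "file-like" handle that is backed by nothing but RAM.
Code: Select all
#define _DEFAULT_SOURCE   /* glibc: exposes fmemopen() and MAP_ANONYMOUS */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    /* mmap with fd -1: an anonymous mapping, i.e. memory with no device behind it. */
    size_t len = 4096;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    strcpy(p, "hello from anonymous memory");

    /* fmemopen: a stdio FILE* over an in-memory buffer. */
    FILE *f = fmemopen(p, len, "r");
    char line[64];
    if (f && fgets(line, sizeof line, f))
        puts(line);

    if (f) fclose(f);
    munmap(p, len);
    return 0;
}
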

In fact, our machines are still von Neumann architectures, mimicking human society as von Neumann originally designed it:
- the CPU is the government, making decisions and creating laws (code);
- RAM is similar to the justice system (don't forget that the USA has case law), remembering the previous results and inputs (data) and storing laws (code);
- and peripherals are the executive branches, like the police, fire departments or the military, executing the orders from the two above and reporting (providing feedback on the results) to them.
Just because there is new hardware (like GPUs specialized in matrix operations) doesn't mean the concept above has changed. It didn't. In the same way, the fact that CPUs now have no-execute protection did not change the concept of storing code as data (think about JIT compilers). It doesn't matter what interface the peripherals use (IO ports or MMIO), or whether you read bytes from a ferrite core, sectors from floppies or SSDs, or a pressed surface with a laser as in CD-ROMs; the concept of separate peripherals remains. Even if you replace the mouse and keyboard with a mind-reading BCI helmet like in Foundation, and storage devices with shiny crystals like in Stargate, that wouldn't change the concept.

Now going a step further into software land, you can see that all kernels use the same concepts too, even though their implementations and APIs differ: they all have files, directories, devices, processes, libraries etc. They all store code in files, and they all load those files into address spaces to create processes. All the same.
rdos wrote:I don't think so. :-)
You can also think that Earth is flat, it won't make that true :-) If you peel the kernels from the implementation details, you'll see the concepts under the hood are exactly the same.

Conclusion: we can only talk about whether a particular implementation is more effective or easier to use than the others; for the concepts, there's nothing new under the Sun.

Cheers,
bzt
nexos
Member
Posts: 1078
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Reinventing Unix is not my problem

Post by nexos »

zaval wrote:nexos, c'mon, UNIX didn't create our Universe. It's just a castrated implementation of MULTICS (just say "eunuchs", see, :mrgreen: it's their own joke). The only good thing derived from UNIX is C. :) I agree with the author that the reason UNIX cloning is so dominant is that universities teach exactly UNIX, MINIX etc. /shrugs/
Well, it didn't create the universe for sure :) . But remember, CS is standing on the shoulders of giants. Ken Thompson stood on the shoulders of Multics, which was probably based on something else, all the way back to the ENIAC. Cutler, whether he knows it or not, stood on the shoulders of Ken Thompson, Bill Joy, the guys from Digital, and many other organizations and companies that created great products. Now true, the CS industry is a MESS nowadays, but guys like Thompson, Ritchie, Joy, Cutler, Wozniak, and (maybe) Torvalds all actually knew their stuff.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reinventing Unix is not my problem

Post by rdos »

bzt wrote:
rdos wrote:I wouldn't say these are the same concepts. For instance, CreateProcess in NT is much better than fork in Unix, and they are certainly not similar.
Now you're talking about implementations and not concepts again. The concept is separate address spaces, filled with different segments (code and data); that's the same for both. As far as the implementation goes, CreateProcess can be emulated with fork+exec, but fork can't be emulated easily with CreateProcess, so it is questionable which one is better. In my opinion the one that can easily mimic the other is the better one. Plus, creating multiple processes connected with pipes is a hell of a lot easier with fork than with CreateProcess and its armada of arguments.
It might be true for the user, but for the implementer, fork is like hell. Particularly on multicore systems and with multithreading. Besides, I think it is better to implement both fork and CreateProcess, and to actually emulate fork + exec as a CreateProcess to get rid of all the shared context and provide a consistent environment.

Also, you really cannot emulate CreateProcess with fork. CreateProcess assumes that you create a completely new context, and this is impossible with fork, which always maintains a nested structure with a root process. In my implementation, I have many roots (created with CreateProcess) that can each be forked. When a fork + exec is performed, the OS makes it look like a new root instead of a hierarchy.
bzt wrote:
rdos wrote:DLLs are not like shared libraries in Unix
Yes, they are. Both are dynamically loaded, and both provide a function library shared among multiple processes. The concept (shared portions of code loaded dynamically as needed) is exactly the same.
DLLs can both be private and shared. I typically only use the private variant, and don't use shared DLLs for the runtime library. DLLs can also be for resources only (for instance, defining texts for different languages) and so don't need to be code & data only.
bzt wrote:
rdos wrote:NT doesn't even use the same executable formats (PE vs ELF).
Again, implementation detail. The concept (that both file formats have a code segment and some data segments) is the same, so much, that objconv can convert between ELF and PE without probs. Actually the Linux kernel is capable of executing programs in the PE format with a module, proving that file format is indeed just a small implementation detail.
Might be, but I wouldn't assign that to Unix. It's a bit of a generic feature.
bzt wrote:
rdos wrote:The use of syscalls with ints is a pretty inefficient method that they should NOT have borrowed from Unix.
Again, implementation detail. One could have used the sysenter/syscall instructions. The point is, some functions require elevated privilege level, accessed by a special instruction and not via "standard" calls. The concept here is separated user space and kernel space (in contrast to Amiga Exec and Singularity for example, where there's only one address space, and kernel functions are accessed the same way as any other library function.)
x86 can use call-gates too, a concept I use indirectly. I let drivers register entry points in a table and code syscalls as far calls to the null-selector, which are patched on first usage to a call-gate. This avoids decoding functions in a common entry-point. It also makes it possible to do syscalls in kernel (those are patched to far calls), and so I can use normal file operations in kernel.
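For anyone curious what that looks like in practice, here is a hypothetical 32-bit sketch of installing a call gate (descriptor layout per the Intel SDM; the gdt array, index and selector are invented for illustration, not my actual code):
Code: Select all
#include <stdint.h>

/* Layout of a 32-bit call-gate descriptor (Intel SDM, vol. 3).
 * Hypothetical sketch: the gdt array, index and selector below are
 * invented for illustration only. */
struct call_gate {
    uint16_t offset_low;    /* handler offset, bits 15..0            */
    uint16_t selector;      /* kernel code segment selector          */
    uint8_t  param_count;   /* dwords copied from the caller's stack */
    uint8_t  access;        /* P, DPL, type 0xC = 32-bit call gate   */
    uint16_t offset_high;   /* handler offset, bits 31..16           */
} __attribute__((packed));

extern struct call_gate gdt[];   /* assumption: GDT visible to C code */

static void install_syscall_gate(int gdt_index, void (*handler)(void),
                                 uint16_t kernel_cs)
{
    uint32_t off = (uint32_t)handler;    /* 32-bit target assumed */
    struct call_gate g = {
        .offset_low  = (uint16_t)(off & 0xFFFF),
        .selector    = kernel_cs,
        .param_count = 0,
        .access      = 0xEC,             /* present, DPL=3, 32-bit call gate */
        .offset_high = (uint16_t)(off >> 16),
    };
    gdt[gdt_index] = g;
    /* User code then enters the handler with a far call through
     * selector (gdt_index << 3) | 3 -- no int or sysenter involved. */
}
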

However, no other OS seems to have adopted this method, probably because it only works on x86 and is not supported by Unix. AMD also decided to break this interface in x86-64, possibly because it didn't fit with Unix and Linux.

So, the "Unix standard" is often an obstacle to doing things in better ways, one that even influences CPU manufacturers.
bzt wrote:
rdos wrote:The same goes for ioctl.
Same concept for both UNIX and WinNT. DeviceIoControl operates on opened file handles (just like ioctl in UNIX), and they provide very similar functionality.
Yes, and it was a pretty poor decision. DOS also supports this horror.
bzt wrote: In fact, our machines are still von Neumann architectures, mimicking human society as von Neumann originally designed it:
- the CPU is the government, making decisions and creating laws (code);
- RAM is similar to the justice system (don't forget that the USA has case law), remembering the previous results and inputs (data) and storing laws (code);
- and peripherals are the executive branches, like the police, fire departments or the military, executing the orders from the two above and reporting (providing feedback on the results) to them.
Just because there is new hardware (like GPUs specialized in matrix operations) doesn't mean the concept above has changed. It didn't. In the same way, the fact that CPUs now have no-execute protection did not change the concept of storing code as data (think about JIT compilers). It doesn't matter what interface the peripherals use (IO ports or MMIO), or whether you read bytes from a ferrite core, sectors from floppies or SSDs, or a pressed surface with a laser as in CD-ROMs; the concept of separate peripherals remains. Even if you replace the mouse and keyboard with a mind-reading BCI helmet like in Foundation, and storage devices with shiny crystals like in Stargate, that wouldn't change the concept.

Now going a step further into software land, you can see that all kernels use the same concepts too, even though their implementations and APIs differ: they all have files, directories, devices, processes, libraries etc. They all store code in files, and they all load those files into address spaces to create processes. All the same.
True to some degree, but this is more like near-universal traits of human-made operating systems & hardware than anything you can assign specifically to Unix.
bzt wrote: Conclusion: we can only talk about whether a particular implementation is more effective or easier to use than the others; for the concepts, there's nothing new under the Sun.
The 386 processor added many new concepts, like segmentation and call-gates, but since these concepts worked poorly with the tools you assign to Unix, they have been discontinued in x86-64. So the conclusion must be that Unix hinders sound hardware & software improvements. :-)
Korona
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm

Re: Reinventing Unix is not my problem

Post by Korona »

Segmentation and call gates are maybe interesting but I am not sure if I'd call these ideas "sound". :D
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reinventing Unix is not my problem

Post by rdos »

Korona wrote:Segmentation and call gates are maybe interesting but I am not sure if I'd call these ideas "sound". :D
The market dominance of Windows and Linux also hinders sound hardware development, since many manufacturers design their hardware for best compatibility with those systems. This is not particularly good for providing new, improved hardware & software. Actually, many hardware devices operate with sub-optimal interfaces because of this.

A very good example is that USB CAN devices are typically implemented on top of a serial port emulation, often using SPI between the CAN controller and the USB controller, resulting in lousy solutions.

Another example is that many odd USB devices are implemented on top of the HID layer, since both Windows & Linux provide direct access to this interface from an application. However, this bloats the devices with a HID layer that is often quite useless, since everything is done on the control pipe anyway.

A third example is digital oscilloscopes, which typically cannot handle realtime data and instead implement some slow interface that these OSes make easy to support without adding new interfaces. They might, for instance, provide it through a network interface or a standardized USB interface. When I made my realtime analyser, I let the PCIe device stream data to main memory and provided a "schedule" through the PCIe BAR. I also added a new interface to read out the data. Neither Unix nor Windows has these kinds of interfaces, and so we see these suboptimal solutions.
thewrongchristian
Member
Posts: 424
Joined: Tue Apr 03, 2018 2:44 am

Re: Reinventing Unix is not my problem

Post by thewrongchristian »

rdos wrote:
bzt wrote:
rdos wrote:The use of syscalls with ints is a pretty inefficient method that they should NOT have borrowed from Unix.
Again, implementation detail. One could have used the sysenter/syscall instructions. The point is, some functions require elevated privilege level, accessed by a special instruction and not via "standard" calls. The concept here is separated user space and kernel space (in contrast to Amiga Exec and Singularity for example, where there's only one address space, and kernel functions are accessed the same way as any other library function.)
x86 can use call-gates too, a concept I use indirectly. I let drivers register entry points in a table and code syscalls as far calls to the null-selector, which are patched on first usage to a call-gate. This avoids decoding functions in a common entry-point. It also makes it possible to do syscalls in kernel (those are patched to far calls), and so I can use normal file operations in kernel.

However, no other OS seems to have adopted this method, probably because it only works on x86 and is not supported by Unix. AMD also decided to break this interface in x86-64, possibly because it didn't fit with Unix and Linux.

So, the "Unix standard" is often an obstacle to doing things in better ways, one that even influences CPU manufacturers.

...
bzt wrote: Conclusion: we can only talk about whether a particular implementation is more effective or easier to use than the others; for the concepts, there's nothing new under the Sun.
The 386 processor added many new concepts, like segmentation and call-gates, but since these concepts worked poorly with the tools you assign to Unix, they have been discontinued in x86-64. So the conclusion must be that Unix hinders sound hardware & software improvements. :-)
The official SysV i386 ABI (and probably the 286 ABI previously) specifies lcall via call gates to do system calls.

Int-based system calls are slow on i386 due to the horrors of the i386 protected-mode model and legacy compatibility, and that has nothing to do with UNIX. The mechanisms added with the 386 (and the 286) just added complexity and made the associated operations slow. And those mechanisms didn't really map in a portable way to the mechanisms provided by other processors, so they were barely used.

It's a bit like the same arguments between RISC and CISC. RISC took what was simple, portable and commonly used, and optimized it at the expense of architectural features that were complex, slow and rarely used as a result.

By contrast, system call trap mechanisms on RISC CPUs are blindingly quick and simple in comparison to i386. MIPS defines just a single mechanism to enter the kernel.
andrew_w
Posts: 19
Joined: Wed May 07, 2008 5:06 am

Re: Reinventing Unix is not my problem

Post by andrew_w »

rdos wrote:The everything is a file concept that I suppose Unix invented is not a great invention. Mostly because they also added the ioctl interface for everything that is not file-related. Which means that every device invents its own ioctl interface that typically is undocumented and incompatible with every other device. The "device-tree" in the file system might seem like a smart idea, but the names are not standardized and it hinders more efficient implementations than just read/write passing buffers.
UX/RT will use separate files for out-of-band APIs rather than implementing ioctl() as a primitive. There will be an ioctl() function compatible with that of legacy Unix, but it will be a library function implemented on top of the separate out-of-band files instead.
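Purely as an illustration of the idea (the path and command below are invented, not the actual UX/RT interface), client code would look something like this instead of an ioctl() call:
Code: Select all
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical Plan 9-style "ctl file" usage; the path and the command
 * string are made up for illustration. */
int main(void)
{
    /* Instead of: ioctl(fd, SOME_DEVICE_SPECIFIC_CODE, &arg); */
    int ctl = open("/dev/ttyS0/ctl", O_WRONLY);    /* invented path */
    if (ctl < 0) { perror("open"); return 1; }

    const char *cmd = "baud 115200\n";             /* invented command */
    write(ctl, cmd, strlen(cmd));
    close(ctl);
    return 0;
}
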

Having devices appear in the filesystem doesn't mean that they are necessarily limited to the traditional Unix APIs that copy. Memory mapping is also an option (even under conventional Unix), and under UX/RT there will also be read/write-type APIs that operate on kernel message registers as well as ones that operate on a shared buffer. As long as the filesystem API is sufficient there should be no need to add extra primitives alongside it, since anything can be relatively easily implemented on top of unstructured I/O streams and memory regions. For services where read/write-type APIs are inconvenient for client code to use directly, higher-level library interfaces can be provided on top of the file-based API. UX/RT will provide higher-level library wrappers for every service except those for which direct use of read/write is the only good realization (which is unlike Linux and other conventional Unices where a lot of special files lack any high-level wrapper even though they could use one).
rdos wrote:The fork() API is not a great invention either and gives a lot of headaches for people trying to implement this on multicore systems. It might have seemed smart when it was invented, but today it's not smart. The CreateProcess of Windows is far better.
Both spawn()/CreateProcess() and traditional fork() suck as process creation primitives IMO because they both do way too much. UX/RT will instead have an "efork()" ("eviscerated/empty fork") primitive that creates a completely empty process in a non-runnable state. This will return a VFS RPC file descriptor to the child process context, and the parent will set up the environment of the child process by calling the same APIs that it uses to control its own environment (all APIs that manipulate process state will have versions that take a process context, and it will also be possible to switch the default context for the traditional versions that don't take one). To start the process, the parent will call either the traditional exec() to have the process run another program, or a new exec()-type call that takes an entry point (this of course will assume that appropriate memory mappings have been set up), after which the VFS RPC connection will become a pidfd and cease to accept any new calls from the parent to change the child's state (it will be possible to use it to wait for an exit status from the child). This eliminates the overhead of fork() for spawning other programs and the overhead of spawn()/CreateProcess() for spawning copies of the same program. It also preserves the ease of manipulating the child's context that comes with fork() and doesn't require a primitive with a whole bunch of different arguments and flags to try to include every possible context modification like spawn()/CreateProcess(). Both fork() and spawn() will be easy to implement as library functions on top of efork().
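To visualize that, here is a purely hypothetical sketch of spawn() built on such a primitive; every name in it (efork, dup2_in, exec_in, ...) is invented and the real UX/RT API may look nothing like this:
Code: Select all
/* Hypothetical sketch only: none of these functions exist; they just
 * mirror the efork() description above. */
typedef int proc_t;       /* VFS RPC descriptor for the child's context */

proc_t efork(void);                                   /* empty, non-runnable child */
int    dup2_in(proc_t child, int oldfd, int newfd);   /* child-context dup2()      */
int    chdir_in(proc_t child, const char *dir);       /* child-context chdir()     */
int    exec_in(proc_t child, const char *path,        /* start it; the descriptor  */
               char *const argv[], char *const envp[]); /* becomes a pidfd after   */

int my_spawn(const char *path, char *const argv[], char *const envp[])
{
    proc_t child = efork();          /* nothing copied, nothing running yet */
    if (child < 0)
        return -1;

    /* The parent shapes the child's environment with the same calls it
     * would use on itself, just aimed at the child's context. */
    dup2_in(child, 1, 2);            /* make the child's fd 2 a copy of fd 1 */
    chdir_in(child, "/tmp");

    return exec_in(child, path, argv, envp);
}
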
rdos wrote:I think the user account model is outdated, and the ancient terminal emulators are of no interest anymore.
I think user accounts should still be present since you may have multiple people using the same system, but they shouldn't be the primary security model. UX/RT will completely eliminate the traditional root/non-root security model of legacy Unix and replace it with per-process lists of permissions to specific files or entire directories (which may specify to take the permissions from the filesystem instead of providing explicit permissions) and a role-based access control system built on top of them. The setuid and setgid bits on executables will no longer have any effect and will be replaced by a database of permissions.
rdos wrote: A third example is digital oscilloscopes, which typically cannot handle realtime data and instead implement some slow interface that these OSes make easy to support without adding new interfaces. They might, for instance, provide it through a network interface or a standardized USB interface. When I made my realtime analyser, I let the PCIe device stream data to main memory and provided a "schedule" through the PCIe BAR. I also added a new interface to read out the data. Neither Unix nor Windows has these kinds of interfaces, and so we see these suboptimal solutions.
It will be easy enough to support such an interface under UX/RT, with a server exporting one or more character- or message-special files for sending commands and configuration, and a memory shadow file covering the buffer to which the analyzer is streaming data. Alternatively if there's no need to multiplex access to the device and you want to avoid any overhead of having a separate server, the application could just access it directly by mapping the files for the device's physical memory regions (along with setting up I/O port permissions if applicable) and setting up an interrupt handler, since UX/RT will be a pure microkernel system and will allow full user access to I/O devices (subject to permissions of course).
nullplan wrote:Plus, NT has features Linux doesn't. It is my understanding that the main NT kernel is a hypervisor, and the various subsystems run as guest operating systems. That is why it was possible to just make a Linux kernel run as yet another subsystem. NT, as it is now, is working fine for what MS is doing with it. There is little impetus to change it.
NT is not a hypervisor (although of course there are several VMMs that run on it). Rather, it is a multi-personality system, which were supposed to be the Next Big Thing™ back in the late 80s and early-mid 90s (although NT wound up being the only really successful example of such a system from that era) in which the base system is personality-neutral and OS personalities are implemented as servers (which are different from guest kernels running under a hypervisor in that they only implement services that are specific to that particular personality and they leave things like scheduling, basic paging, device drivers, network protocols, and disk filesystems up to the personality-neutral base system).

I could see Microsoft possibly replacing NT with Linux because that would mean they would be able to take advantage of the shared development of the Linux kernel rather than having to do everything themselves. Also, Wine provides fairly good (but far from perfect) Windows compatibility for Linux, and NT could still be run under virtualization for stuff that won't run on Wine (they would very likely have to continue to maintain NT alongside the Linux-based replacement but not necessarily add new features to it). They've also started porting some of their applications to Linux as well, so that could be a sign that they're intending to replace NT with Linux. That being said, I'm not sure if they will actually go through with it or not. One serious issue that I could see is the Linux developers' stubborn refusal to stabilize the kernel APIs (which will also complicate things a bit for UX/RT since it will use multiple instances of the Linux kernel as a driver layer through the LKL project).
Developer of UX/RT, a QNX/Plan 9-like OS
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Reinventing Unix is not my problem

Post by bzt »

rdos wrote:It might be true for the user, but for the implementer, fork is like hell. Particularly on multicore systems and with multithreading.
Implementing fork / CreateProcess is hard, agreed; it's one of the hardest tasks if you want to do it correctly and efficiently. But I don't see much difference in their implementation, except that fork can use CoW, which simplifies things, while CreateProcess can't (you must specify everything to be added to the new address space). With multicore I see no difference. Same with multithreading: actually, threads should be completely userspace and transparent to the kernel (if they are not, then they aren't really threads any more, rather LWPs (lightweight processes), and it doesn't matter to fork whether it's copying shared data or TLS data to the new address space).
rdos wrote:Besides, I think it is better to implement both fork and CreateProcess
Yes, if you need CreateProcess functionality then you should.
rdos wrote:DLLs can both be private and shared.
I don't know what you mean by "private". If you meant modules specific to a certain application, that works just fine with .so files too. If you meant the visibility of symbols in the shared object, that works just fine too.
rdos wrote:I typically only use the private variant, and don't use shared DLLs for the runtime library.
I don't see how this is relevant. Both DLL and .so can be used like that.
rdos wrote:DLLs can also be for resources only (for instance, defining texts for different languages) and so don't need to be code & data only.
Neither does .so. It's perfectly valid to have only data segments in an ELF (which is what resources are). You can even give names to the segments (like ".icon", ".translation" etc.) I use this to add font to mykernel for example. I link it statically, because this simple kernel example doesn't have a run-time ELF loader, but font.o could be a shared object as well.
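This is a minimal sketch of the technique (the font file name is just an example): objcopy turns any binary blob into a linkable object, and the data is then reachable as ordinary symbols.
Code: Select all
/* Turn an arbitrary binary into a linkable object with a named section:
 *
 *   objcopy -I binary -O elf64-x86-64 -B i386:x86-64 \
 *           --rename-section .data=.font font.psf font.o
 *
 * objcopy emits these symbols automatically: */
extern const unsigned char _binary_font_psf_start[];
extern const unsigned char _binary_font_psf_end[];

unsigned long font_size(void)
{
    /* The "resource" is now ordinary data in a named section, whether
     * font.o is linked statically or packed into a shared object. */
    return (unsigned long)(_binary_font_psf_end - _binary_font_psf_start);
}
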
rdos wrote:x86 can use call-gates too, a concept I use indirectly. I let drivers register entry points in a table and code syscalls as far calls to the null-selector, which are patched on first usage to a call-gate. This avoids decoding functions in a common entry-point. It also makes it possible to do syscalls in kernel (those are patched to far calls), and so I can use normal file operations in kernel.
Again, implementation detail. The concept is: separate kernel space from user space. It doesn't matter what instruction you use as long as it differs from a normal call.
rdos wrote:AMD also decided to break this interface in x86-64, possibly because it didn't fit with Unix and Linux.
Nope, they obsoleted this interface because WinNT did not use it, and M$ didn't want it. M$ influences chip manufacturers (and Intel in particular) a lot more than Linux ever will.
rdos wrote:True to some degree, but this is more like near-universal traits of human-made operating systems & hardware than anything you can assign specifically to Unix.
Sure, that's why I call these OS concepts.
rdos wrote:The 386 processor added many new concepts, like segmentation and call-gates
Those aren't concepts, just different implementations of protection (memory being executable or not). And completely useless, may I add, because everybody just sets base 0 and limit 4G in them.
rdos wrote:but since these concepts worked poorly with the tools you assign to Unix, they have been discontinued in x86-64.
That's totally false. For one, AMD didn't give a **** about Unix when they designed x86-64, and Linux was just a hobby for some computer enthusiasts back then, not as widespread as it is today. AMD removed those features because Windows (their primary objective) didn't need them.
rdos wrote:The market dominance of Windows and Linux also hinders sound hardware development, since many manufacturers are designing their hardware for best compatibility with those systems.
Now, that's true. Not sure if it's a real problem though, it means more standardization, which means easier hobby OS development too. If you want to experiment with unusual hardware, you have FPGA.
rdos wrote:I think the user account model is outdated, and the ancient terminal emulators are of no interest anymore.
You are completely wrong about that. First, the user account model is independent of terminal emulators. User authentication is common; it exists with RDP, X11 sessions, and hell, even with the HTTP protocol. Identifying the user and assigning a user account to the current session is the number one step in all security systems, no matter what systems those are. Second, terminal emulators aren't ancient, SSH for example is being used on billions of connections every single day all around the globe. Just ask any server administrator if they are executing PuTTY / ssh several times a day or not. Even M$ is actively developing a new terminal emulator as we speak (see Microsoft Terminal, first commit Aug 11, 2017, last commit today).

Cheers,
bzt
andrew_w
Posts: 19
Joined: Wed May 07, 2008 5:06 am

Re: Reinventing Unix is not my problem

Post by andrew_w »

bzt wrote:
rdos wrote:I typically only use the private variant, and don't use shared DLLs for the runtime library.
I don't see how this is relevant. Both DLL and .so can be used like that.
I think he's talking about libraries that are copied in their entirety for each process that links with them, which is something that OSes that use dlls commonly do. I'm not sure why that would be an advantage though. From what I understand, the reason why DLLs are often not shared is just to work around the limitations of the PE, LX, and NE object formats. I can't think of a good reason not to share read-only parts of libraries and executables.
bzt wrote:
rdos wrote:DLLs can also be for resources only (for instance, defining texts for different languages) and so don't need to be code & data only.
Neither does .so. It's perfectly valid to have only data segments in an ELF (which is what resources are). You can even give names to the segments (like ".icon", ".translation" etc.) I use this to add font to mykernel for example. I link it statically, because this simple kernel example doesn't have a run-time ELF loader, but font.o could be a shared object as well.
Embedding data in executables and libraries is less useful for an OS with a package manager. The only places where I can think of it being a good idea in such a system would be for kernels and the like, and for embedding icons in executables.
bzt wrote:Second, terminal emulators aren't ancient, SSH for example is being used on billions of connections every single day all around the globe. Just ask any server administrator if they are executing PuTTY / ssh several times a day or not. Even M$ is actively developing a new terminal emulator as we speak (see Microsoft Terminal, first commit Aug 11, 2017, last commit today).
Agreed that terminal emulators aren't obsolete. However, I do think that OS-level terminal APIs could use some improvement. UX/RT will bring back the concept of a "listener" from Multics (albeit based on an extensible pseudo-terminal server rather than a sort of wrapper providing lines to the shell through function calls). This will provide a set of rich APIs for completion, editing, and history alongside the traditional terminal interface, replacing library-based hacks like readline. The base pseudo-terminal server will be almost policy-free and most of the advanced features of the listener will be implemented in external helper programs. Serial ports and the console terminal emulator will provide only the basic terminal interface and the listener will run on top of them, whereas GUI terminal emulators and remote shell servers will directly connect to the listener through a superset of the standard ptmx interface.
Last edited by andrew_w on Fri Apr 02, 2021 6:08 pm, edited 1 time in total.
Developer of UX/RT, a QNX/Plan 9-like OS
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reinventing Unix is not my problem

Post by rdos »

bzt wrote:
rdos wrote:It might be true for the user, but for the implementer, fork is like hell. Particularly on multicore systems and with multithreading.
Implementing fork / CreateProcess is hard, agreed; it's one of the hardest tasks if you want to do it correctly and efficiently. But I don't see much difference in their implementation, except that fork can use CoW, which simplifies things, while CreateProcess can't (you must specify everything to be added to the new address space). With multicore I see no difference. Same with multithreading: actually, threads should be completely userspace and transparent to the kernel (if they are not, then they aren't really threads any more, rather LWPs (lightweight processes), and it doesn't matter to fork whether it's copying shared data or TLS data to the new address space).
Not so. CreateProcess can initialize the address space knowing that there is no activity there. With fork, you have no idea if other threads are running, and so the whole operation must be atomic or protected with semaphores. COW is required with fork, but not necessary with CreateProcess: there simply is no shared context with CreateProcess that must be split if one of the sides writes to a page. Even the COW handling must be atomic and reentrant with fork, given that multiple threads might try to write to a page at the same time on different cores. This is what makes fork problematic to support, both when creating it and when maintaining it later.
bzt wrote:
rdos wrote:DLLs can also be for resources only (for instance, defining texts for different languages) and so don't need to be code & data only.
Neither does .so. It's perfectly valid to have only data segments in an ELF (which is what resources are). You can even give names to the segments (like ".icon", ".translation" etc.) I use this to add font to mykernel for example. I link it statically, because this simple kernel example doesn't have a run-time ELF loader, but font.o could be a shared object as well.
Resources in DLLs are not just shared data. They are typed. For instance, you can define string #100 to be some language phrase, and then you create an English version where 100 might be "Hello" and a Swedish where it is defined as "Hej". Then you load the language DLL you want to use and just use the indexes to get the phrase you want. This is a great way to handle different languages in an application. Resources in DLLs can also be icons or binary data, but that is less useful. I use binary data to include boot sectors & boot loaders in the command shell so I can create partitions & format drives.
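As a minimal Win32-style sketch of that pattern (the DLL name and string ID are examples, not my actual resources):
Code: Select all
#include <stdio.h>
#include <windows.h>

/* Language-DLL sketch: the same string ID resolves to a different phrase
 * depending on which resource-only DLL is loaded. "lang_sv.dll" and
 * ID 100 are made-up examples. */
int main(void)
{
    HMODULE lang = LoadLibraryExA("lang_sv.dll", NULL,
                                  LOAD_LIBRARY_AS_DATAFILE); /* resources only */
    if (!lang) {
        printf("could not load language DLL\n");
        return 1;
    }

    char phrase[128];
    if (LoadStringA(lang, 100, phrase, (int)sizeof phrase) > 0)
        printf("%s\n", phrase);        /* e.g. "Hej" from the Swedish DLL */

    FreeLibrary(lang);
    return 0;
}
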
bzt wrote:
rdos wrote:x86 can use call-gates too, a concept I use indirectly. I let drivers register entry points in a table and code syscalls as far calls to the null-selector, which are patched on first usage to a call-gate. This avoids decoding functions in a common entry-point. It also makes it possible to do syscalls in kernel (those are patched to far calls), and so I can use normal file operations in kernel.
Again, implementation detail. The concept is: separate kernel space from user space. It doesn't matter what instruction you use as long as it differs from a normal call.
A call gate is a normal call. Well, almost anyway since it is a far call.
bzt wrote:
rdos wrote:The 386 processor added many new concepts, like segmentation and call-gates
Those aren't concepts, just different implementations of protection (memory being executable or not). And completely useless, may I add, because everybody just sets base 0 and limit 4G in them.
I try to avoid that as much as possible, at least in kernel space. The new VFS will remove most of the need to use the flat selector in kernel space.
bzt wrote:
rdos wrote:I think the user account model is outdated, and the ancient terminal emulators are of no interest anymore.
You are completely wrong about that. First, the user account model is independent of terminal emulators. User authentication is common; it exists with RDP, X11 sessions, and hell, even with the HTTP protocol. Identifying the user and assigning a user account to the current session is the number one step in all security systems, no matter what systems those are. Second, terminal emulators aren't ancient, SSH for example is being used on billions of connections every single day all around the globe. Just ask any server administrator if they are executing PuTTY / ssh several times a day or not. Even M$ is actively developing a new terminal emulator as we speak (see Microsoft Terminal, first commit Aug 11, 2017, last commit today).
Supporting user accounts when logging in to servers (or when acting as a server yourself) is OK. It's the user account model on the local computer that is outdated. Once I have my ext4 VFS and NTFS VFS, I can easily break the security model of both Windows and Linux by booting my own OS on the machine and then being able to access EVERYTHING, since I won't support the access control embedded in the filesystem. This is just security by complexity that isn't worth anything.
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Reinventing Unix is not my problem

Post by rdos »

andrew_w wrote: I think he's talking about libraries that are copied in their entirety for each process that links with them, which is something that OSes that use dlls commonly do. I'm not sure why that would be an advantage though. From what I understand, the reason why DLLs are often not shared is just to work around the limitations of the PE, LX, and NE object formats. I can't think of a good reason not to share read-only parts of libraries and executables.
I think the answer to that is version incompatibility. I don't want to deal with the mess of version incompatibility, so I link my executables statically and use DLLs only for resources or for isolating interfaces. Also, the shared libc is huge, and if I write small programs that only use small parts of it, the total image size will be a lot smaller with static linking.
vvaltchev
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

rdos wrote:Supporting user accounts when logging in to servers (or when acting as a server yourself) is OK. It's the user account model on the local computer that is outdated. Once I have my ext4 VFS and NTFS VFS, I can easily break the security model of both Windows and Linux by booting my own OS on the machine and then being able to access EVERYTHING, since I won't support the access control embedded in the filesystem. This is just security by complexity that isn't worth anything.
The classic user account model on local computers was never designed to protect against somebody who has physical access to the machine (e.g. who can reboot it and boot another OS): it protects users from unintentionally accessing (or writing) data they shouldn't, and it allows per-user customization. For example: my wife and I share some home machines. We don't need "security" against each other on the local accounts, but having different accounts is still necessary: each of us has their own account with different settings, files and wallpaper. Also, we cannot unintentionally delete each other's files. None of this is broken: it's a valid use case, though I agree it has nothing to do with security.

If you want to protect data against people who have physical access to the machine, you have to use encryption. But actually, even that is not enough, because experts use all sorts of techniques to work around encryption, like reading data directly from the memory chips after turning off the machine in a "rough" way, trying to recover the keys. So, if you really want to protect against people who have physical access to the machine all the time, that's a security nightmare. You probably have to buy special machines designed against such attacks and, still, limit yourself a lot in the software you're using. I'm not enough of a security expert to say more, but you get the idea.
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck