
Everything is a $ABSTRACTION and Microkernel Hell

Posted: Thu Sep 20, 2007 11:14 am
by Crazed123
We've all seen it in our systems at times. Maybe our process model got too complicated, or we didn't know how to make capabilities revocable. Or perhaps we couldn't figure out how to move kernel-level objects between processes via our elegant portal IPC (OK, that one's mine).

Eventually, those of us who try to create elegant systems run into Microkernel Hell (not necessarily confined to microkernels, but most common among them): the point at which enhancing the elegance of one part of the system reduces it somewhere else, and yet we still have to make a choice.

I, for example, would argue that the first-ever instance of Microkernel Hell was the core mechanism of microkernels: message passing.

The most elegant solution is usually to make the OS a little larger and a little less extensible in one place so as to provide a powerful abstraction capable of expressing nearly anything.

For Windoze, that abstraction was COM objects. But COM objects demand far too much of user-land code and offer no uniform interface. For Unix and later Plan 9, it was file I/O. But file I/O cannot properly represent control flow, only data flow. For Amiga, it was ARexx commands. But ARexx ports relied on the absence of memory protection. For Pebble and Kea, it was portals. But portals require butt-ugly interfaces to pass kernel-level data (like memory pages or capabilities) between protection domains and end up not really resembling function calls or system calls at all.

Now my question is: how can we best fill in "Everything is a $ABSTRACTION" to achieve not merely a uniform interface, but true extensibility? In the ideal case, we could even implement every kernel operation but one in terms of that $ABSTRACTION (portals have this property, but see above for their other problems) without breaking the uniform interface.

From there, manipulating $ABSTRACTION would allow customization and whole or partial remodeling of the OS, as well as bringing any benefits that come with $ABSTRACTION itself (like distributed 9P2000 for file I/O).

Re: Everything is a $ABSTRACTION and Microkernel Hell

Posted: Thu Sep 20, 2007 1:46 pm
by Colonel Kernel
I've heard it argued before that a typical microkernel as implemented most commonly today is an abstraction inversion. That is, the reason your elegant ideas keep getting tangled up is because low-level details keep pervading the entire design.

To use an analogy, consider typical lock-based concurrent programming. If you're designing a component that has to exist in a multi-threaded environment, the details of that threading model pervade every aspect of your design. You have to carefully separate shared from unshared data, figure out how to update multiple related pieces of shared data atomically without causing deadlock, etc. If you don't get it right the first time, you're hooped. To put it another way, lock-based concurrent programming is not composable.
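To make the non-composability concrete, here's a minimal C++ sketch (my own illustration, not anything from a real system): each transfer() is correct on its own, yet composing two calls that take the same pair of locks in opposite order can deadlock.

Code:

#include <functional>
#include <mutex>
#include <thread>

struct Account {
    std::mutex m;
    long balance = 0;
};

// Correct in isolation: lock both accounts, then move the money.
void transfer(Account& from, Account& to, long amount) {
    std::lock_guard<std::mutex> lock_from(from.m);  // take the source lock...
    std::lock_guard<std::mutex> lock_to(to.m);      // ...then the destination lock
    from.balance -= amount;
    to.balance += amount;
}

int main() {
    Account a, b;
    // Composing two correct calls with opposite argument order can deadlock:
    // thread 1 holds a.m and waits for b.m while thread 2 holds b.m and waits for a.m.
    std::thread t1(transfer, std::ref(a), std::ref(b), 10);
    std::thread t2(transfer, std::ref(b), std::ref(a), 20);
    t1.join();
    t2.join();
}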

Bringing this back to microkernels, consider this again:
Crazed123 wrote:For Amiga, it was ARexx commands. But ARexx ports relied on the absence of memory protection.
Memory protection is an implementation detail. Why do you need it? For isolation -- isolation is the abstraction, not memory protection.

I believe that memory protection is not composable in the same way that locks are not composable. Address spaces exist at a certain level of granularity in most OSes because they have a certain cost associated with them. This limits the scalability of a system built on such "heavy-weight" processes. Compare this to light-weight processes in Erlang, for example -- you can use them at a very fine level of granularity without compromising scalability, and you get all the benefits of isolation too.

What I've come to realize is that abstracting away details such as concurrency and isolation should not be the job of the OS -- it's really the job of the programming language and run-time environment. The OS should just be the arbitrator and communication mechanism ("software bus" if you will) for the rest of the system. That means that ultimately, we have to ditch unmanaged languages like C and C++ in favour of type-safe alternatives (be it Erlang, Scala, C#, Java, whatever).

IMO, that is the way out of Microkernel Hell.

Re: Everything is a $ABSTRACTION and Microkernel Hell

Posted: Thu Sep 20, 2007 1:55 pm
by Brynet-Inc
Colonel Kernel wrote:That means that ultimately, we have to ditch unmanaged languages like C and C++ in favour of type-safe alternatives (be it Erlang, Scala, C#, Java, whatever).
You my friend, are the living incarnation of evil... :wink:

Instead of eliminating superior languages for limited inferior ones, how about we eliminate inferior programmers? ;)

Sounds like a plan 8)

Re: Everything is a $ABSTRACTION and Microkernel Hell

Posted: Thu Sep 20, 2007 2:23 pm
by Candy
Colonel Kernel wrote:To use an analogy, consider typical lock-based concurrent programming. If you're designing a component that has to exist in a multi-threaded environment, the details of that threading model pervade every aspect of your design. You have to carefully separate shared from unshared data, figure out how to update multiple related pieces of shared data atomically without causing deadlock, etc. If you don't get it right the first time, you're hooped. To put it another way, lock-based concurrent programming is not composable.

I believe that memory protection is not composable in the same way that locks are not composable. Address spaces exist at a certain level of granularity in most OSes because they have a certain cost associated with them. This limits the scalability of a system built on such "heavy-weight" processes. Compare this to light-weight processes in Erlang, for example -- you can use them at a very fine level of granularity without compromising scalability, and you get all the benefits of isolation too.

What I've come to realize is that abstracting away details such as concurrency and isolation should not be the job of the OS -- it's really the job of the programming language and run-time environment. The OS should just be the arbitrator and communication mechanism ("software bus" if you will) for the rest of the system. That means that ultimately, we have to ditch unmanaged languages like C and C++ in favour of type-safe alternatives (be it Erlang, Scala, C#, Java, whatever).
I disagree.

While I agree that concurrency and isolation need to be solved at the programming language level, I disagree with the context you bring with it. There is no need for the programming language to be managed or anything of the sort; the only thing the language should and must do is make the abstractions easier to use. They should be made composable, as you stated before. I see the future of languages hiding the complexity of isolation more in encapsulation within (well-chosen) language constructs than in modifications to the language itself. The operating system must act as an arbiter for all code, a final barrier in system protection.
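For instance (a rough sketch of my own, with made-up names): an ordinary library type can encapsulate the locking discipline so callers cannot forget it, without any modification to the language itself.

Code:

#include <mutex>
#include <utility>

// A wrapper that only exposes the protected value while the lock is held.
template <typename T>
class Synchronized {
    T value_;
    std::mutex m_;
public:
    explicit Synchronized(T v) : value_(std::move(v)) {}

    // The only access path: the callback runs with the lock held.
    template <typename F>
    auto with(F&& f) {
        std::lock_guard<std::mutex> lock(m_);
        return f(value_);
    }
};

// Usage: the caller never sees the mutex, so it cannot mismanage it.
// Synchronized<int> counter{0};
// counter.with([](int& c) { ++c; });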

Make it easy to do it right and people will. The managed approach is the antithesis of the C++ foundation - make things fast, and as safe as that allows. The managed approach is based more on making things safe first and fast second. Safety has its merits, but I don't think the pointless checks that get inserted automatically into managed code are the way to security.

Posted: Thu Sep 20, 2007 4:20 pm
by Crazed123
I just think managed-code operating systems (like Oberon from the Development board) are dismal, evil, horrible ideas because they force software developers to retarget their language toolchains to whatever VM architecture the system uses, often adding significant overhead.

After all, a C implementation of "ls" doesn't really need the features provided by say... the .NET framework or CIL VM. Why should it have to target them?

Furthermore, why should programs need to be written in the same language to communicate well or be composable (as these "One Managed Language or VM" proposals seem to go for)? I :heart: Bash script exactly because it doesn't require that.

Posted: Fri Sep 21, 2007 9:17 am
by Colonel Kernel
Brynet-Inc wrote:You my friend, are the living incarnation of evil... :wink:
Thank you. ;)
Instead of eliminating superior languages for limited inferior ones, how about we eliminate inferior programmers? ;)

Sounds like a plan 8)
Normally I wouldn't respond to your trolls, but I'd just like to point out that you clearly have no idea how much productivity is lost due to stupid mistakes caused by even decent programmers, never mind inferior ones. The memory management issues that C and C++ force developers to address are a pain in the butt for most userland programming tasks. I tell you this from my 10 years of experience developing commercial software, mostly in C++.

Where exactly do your pointless comments come from, other than an inferiority complex?
Candy wrote:While I agree that concurrency and isolation need to be solved at the programming language level, I disagree on the context you bring with it.
You may be reading more into that context than I intended to say.
There is no need for the programming language to be managed or anything such; the only thing the programming language should and must do is make it easier to use the abstractions.
Drop the word "managed" and replace it with "type-safe", and remember that making it easier to use abstractions involves making it more difficult to use them incorrectly, and I totally agree with you.
They should be made composable as you've just stated before. I see the future of language hiding the complexity of isolation more in encapsulation within language constructs (chosen constructs) rather than hiding it in language modifications.
This is a good philosophy when it comes to language design, but this issue is completely orthogonal to what I was talking about. For example, C# is a type-safe language that has language modifications for new abstractions (e.g. -- LINQ), while Scala is a type-safe language that is generic enough to add powerful new features via libraries alone (e.g. -- Erlang-style Actors). In many ways, Scala reminds me of C++ -- not in terms of syntax or the C part, but in terms of the philosophy behind it (multi-paradigm, extensible syntax, powerful libraries). You should check it out.
The operating system must act as an arbiter for all code, a final barrier in system protection.
Yep, that's what I meant. In a system that uses software isolation, the OS guarantees that isolation by verifying code for type-safety before running it (usually when it's installed, so it doesn't need to be done every time you launch a process).
Make it easy to do it right and people will. The managed approach is the antithesis of the C++ base foundation - make stuff quick and as safe as that allows.
In the next few years, I think hardware will progress to the point where a C++ programmer could no longer easily write code that outperforms a sufficiently clever optimizing compiler for a higher-level language. The main reason? Multi-core. In the coming years, the performance of code is going to be defined mostly by its degree of parallelism, scalability, and locality of reference (important for NUMA systems, which I think will become increasingly prevalent). We're not programming simple von Neumann machines any more. So I don't buy this "speed first" argument in favour of C and C++. Secondly, with all the evil malware out there today, I disagree with speed as a priority over safety in the first place. :P
The managed approach is based more on making it safe and then quick. Safety is a good point but it has its merits and I think pointless checks as they are being inserted automatically into managed code are not the way to security.
Are you saying that things like array bounds checking don't achieve security, or that you don't like the sacrifice of performance? If it's the first, I agree that it isn't a full solution, but such type-safe code can at least prevent stupid mistakes like buffer overruns, and if there are run-time checks required to ensure that, I think it's worth it. If it's the second, a good optimizing compiler can actually eliminate a lot of run-time checks from the generated code. In Singularity for example, dynamic loading is not permitted, so all the code that is going to run in a process is available at the time it is verified, compiled to native code, and optimized. This means that the IL-to-native compiler can apply many whole-program optimizations that are much more far reaching than what a typical modern compiler can achieve (be it C/C++, Java, C#, or whatever). Also, there is a lot of research going on into dependent types, which allow the compiler to prove that certain pre-conditions are met at compile time, and thus drop many run-time checks from the generated code.
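Here's roughly the kind of thing I mean, sketched in C++ purely as an illustration (a real verifier works on the IL, and the function names are made up):

Code:

#include <cstddef>
#include <vector>

// Checked access on every element: the price of safety is a branch
// per iteration (think checked array access in a safe language).
long sum_checked(const std::vector<long>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v.at(i);   // bounds check on every access
    return total;
}

// The loop condition already proves i < v.size(), so a whole-program
// optimizer (or a dependent type on the index) can drop the per-element
// check without giving up any safety.
long sum_unchecked(const std::vector<long>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];      // same guarantee, no redundant check
    return total;
}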

I think you're imagining a lot of run-time overhead that doesn't need to be there. With a good enough type system and compiler, I think a higher-level language can outperform C/C++ on nearly all tasks.
Crazed123 wrote:I just think managed-code operating systems (like Oberon from the Development board) are dismal, evil, horrible ideas because they force software developers to retarget their language toolchains to whatever VM architecture the system uses, often adding signficant overhead.
I think you're imagining the same phantoms that Candy is. Take Singularity, for example. The only extra run-time overhead (besides GC, which I just don't have time to get into in this post) is some extra array bounds checks and the like, many of which are optimized out before the code runs anyway. There is no "VM" per se, just a type-safe language, compiler, and verifier.
After all, a C implementation of "ls" doesn't really need the features provided by say... the .NET framework or CIL VM. Why should it have to target them?
Depends what you mean by "target them". If you wanted to ensure that your "ls" wasn't actually "ls_secretly_hijack_your_computer", you might want to compile it to an intermediate language that could be verified easily, translated to native code, and optimized at install-time. There is way less overhead involved in this than in today's JIT-compiling CLR or JVM.

If by "target them" you mean "use base class libraries", ls should use whatever it needs, and no more. In Singularity for example (sorry if I sound like a broken record, but it is the system of this type that I'm most familiar with), all system libraries are linked statically at install-time, and then the IL-to-native compiler does "tree-shaking" -- basically it eliminates all dead code that is never used while creating the final native executable. In the end, you only pay for what you use, which is the point of C/C++, isn't it?
Furthermore, why should programs need to be written in the same language to communicate well or be composable (as these "One Managed Language or VM" proposals seem to go for)? I :heart: Bash script exactly because it doesn't require that.
Now here you've hit the big problem I see with this type of system -- choice. I believe in choice, especially when it comes to languages. I also believe that it's possible to go too far in the direction of designing languages with exotic type systems that can be compiled into very efficient code, but are very difficult for the average programmer to understand (Scala tilts a bit too far in this direction IMO). If someone wants to program in a dynamic language like Ruby on such an OS, they shouldn't be unduly punished with crappy performance. I'm not sure what the right trade-off is here.

However, I still believe that this is a problem for language designers to solve, not OS designers. I think this type of OS architecture is the future, one way or another.

Posted: Fri Sep 21, 2007 11:51 am
by Crazed123
Colonel Kernel wrote:I think you're imagining the same phantoms that Candy is. Take Singularity, for example. The only extra run-time overhead (besides GC, which I just don't have time to get into in this post) is some extra array bounds checks and the like, many of which are optimized out before the code runs anyway. There is no "VM" per se, just a type-safe language, compiler, and verifier.
I'm sorry. By "overhead" I meant overhead of wasted human time, not wasted computer time. Thanks to Moore's Law computer time has become cheap enough to "waste" on good things like run-time safety checks, extremely safe type systems, and proof-carrying code.

But it will always cost great amounts of human time to write compilers that generate (for example) LLVM intermediate code instead of machine assembler. Not only do immense collections of legacy compilers like GCC exist, but so do lots of weird little one-language legacy compilers like Free Pascal Compiler or Chicken Scheme. Oh, and writing one's own assembler for a new machine architecture remains easier than trying to port LLVM or even the "designed for bare metal" LLVA toolchain over.

VM-based language systems make too many assumptions about the machines and operating systems they run on, and advanced-but-static languages like Haskell still can't solve the problem of keeping everything on the system correct without either forcing all programs to compile into Haskell (in some cases impossible, in other cases merely exorbitantly difficult) or just running unsafe code and using hardware protection features like virtual address spaces.

And lastly, binary scanning of arbitrary machine code (a la Go! operating system) is just a ***** to implement and maintain as ISAs change.

Posted: Fri Sep 21, 2007 9:05 pm
by bewing
As Moore's law progresses, machines become more vulnerable to glitches. Gamma rays flip bits in memory. Chips go bad. Electrons tunnel. The machine that you would need to successfully implement an overarching, perfect type-safety system would be so susceptible to glitches that it would crash daily.

At the same time, what happens if a program fails a bounds check? It gets an error termination. What happens if the bounds check isn't there? The program crashes with some other error condition. It comes down to the same thing, and there is no benefit to building in the extra safety, so long as:

if an OS can protect user's data, other user processes, and the outside world from a malicious or idiotic program -- that is good enough. All the rest is overkill.

Posted: Fri Sep 21, 2007 10:10 pm
by Alboin
Colonel Kernel wrote: Normally I wouldn't respond to your trolls, but I'd just like to point out that you clearly have no idea how much productivity is lost due to stupid mistakes caused by even decent programmers, never mind inferior ones.
Linus seems to agree with Brynet. ;)
Colonel Kernel wrote:However, I still believe that this is a problem for language designers to solve, not OS designers. I think this type of OS architecture is the future, one way or another.
It's only the future because once Microsoft has a finished version of Singularity, they'll replace NT with it, and start another 'revolution'.

You can count me out.

Such an architecture is simply too complex. What happened to KISS?

Posted: Fri Sep 21, 2007 10:57 pm
by Colonel Kernel
Crazed123 wrote:I'm sorry. By "overhead" I meant overhead of wasted human time, not wasted computer time. Thanks to Moore's Law computer time has become cheap enough to "waste" on good things like run-time safety checks, extremely safe type systems, and proof-carrying code.
At least we agree on something. :)
But it will always cost great amounts of human time to write compilers that generate (for example) LLVM intermediate code instead of machine assembler.
No technological progress ever comes for free. I'm not saying that everyone and their dog should implement their own intermediate language and automated theorem prover, but somebody's gotta do it.
VM-based language systems make too many assumptions about the machines and operating systems they run on
I thought most ILs were pretty straightforward. Do you have any examples?
and advanced-but-static languages like Haskell still can't solve the problem of keeping everything on the system correct without either forcing all programs to compile into Haskell (in some cases impossible, in other cases merely exorbitantly difficult)
I agree that enforcing this kind of type safety at the source-code level is just nuts, plus it leaves developers with no choice of language. Not good.
or just running unsafe code and using hardware protection features like virtual address spaces.
It is nice to have as a backup option, but IMO that's what it should be -- a backup.
And lastly, binary scanning of arbitrary machine code (a la Go! operating system) is just a ***** to implement and maintain as ISAs change.
How often do ISAs change? IMO they grow, but don't really change. So what if someone has to add extra support for SSE6 in two years? Like I said before, nothing comes for free... somebody has to do it. If it's not someone writing a verifier, then it will be somebody writing a C/C++ compiler back-end, or someone modifying the task-switching code in all the OS kernels...
bewing wrote:As Moore's law progresses, machines become more vulnerable to glitches. Gamma rays flip bits in memory. Chips go bad. Electrons tunnel. The machine that you would need to successfully implement an overarching, perfect type-safety system would be so susceptible to glitches that it would crash daily.
What does running type-safe code have to do with hardware glitches? I fail to see the connection. How is it any different from the hardware's point of view than running any other kind of code? ECC still works... NMIs still get delivered... I really don't see what you're getting at.
At the same time, what happens if a program fails a bounds check? It gets an error termination.
More likely an exception is thrown that the program can catch and deal with, or not (in which case it is terminated).
What happens if the bounds check isn't there? The program crashes with some other error condition.
No, because if the bounds check isn't there, the verifier proved that it was not necessary, so there wouldn't be a crash...
It comes down to the same thing, and there is no benefit to building in the extra safety
I think you've made some unfounded assumptions somewhere along the line. If you're comparing the consequences of faulty programs being terminated due to a robust isolation mechanism, then the comparison you should be making is in terms of performance, not safety. My point was that a single-address space OS is going to be faster than one that uses the MMU.
if an OS can protect user's data, other user processes, and the outside world from a malicious or idiotic program -- that is good enough. All the rest is overkill.
What do you think "the rest" is? The kind of system I'm talking about fulfills all those goals, and does it with better performance.
Alboin wrote:Linus seems to agree with Brynet. ;)
Well, that's a very sad commentary on Linus then. :twisted:

Seriously, have you noticed how much cursing and how little rational thought there is in Linus' writing? I don't have a lot of respect for him on an intellectual level. Despite his accomplishments, he's always struck me as being someone with a bad case of tunnel vision. I think the only thing he is really capable of doing well is kernel hacking... emphasis on the word "hacking".

BTW, on the subject of the Tanenbaum-Torvalds debate, I declare them both gravely mistaken. :twisted:
It's only the future because once Microsoft has a finished version of Singularity, they'll replace NT with it, and start another 'revolution'.

You can count me out.
I also worry that MS is in a position to be the first to make this kind of OS a commercial reality. I don't have a problem with Singularity technically -- I think it's quite brilliant (as I'm sure you can tell) and the people involved seem genuinely intelligent and dedicated, but I'm worried about what will happen once it makes it into the hands of Ballmer and the same idiots who brought us Vista and the Zune. <cringe/>

I don't know who has the resources to compete in this area though. Type theory and proof-carrying code is nifty but highly complex stuff that requires the likes of MScs and PhDs to understand, let alone implement.
Such an architecture is simply too complex. What happened to KISS?
Ask the original poster. :P

Posted: Sat Sep 22, 2007 10:20 am
by miro_atc
The best $abstraction I have ever seen is 1394 and some related concepts like HAVi.
You may say that this has nothing to do with OSes, kernels, etc., and that would be true, but consider the following environment:
1) You have a network consisting of nodes, where each node can be a piece of hardware (a real device) or some software abstraction.
2) Every node has some info describing its capabilities etc., but more importantly, all of that information is available to everyone.
3) There are a few interfaces that each node can implement. At a low level you may have address-mapped memory or command registers, or data streams. At a high level you may have an unlimited number of standards for very complex interfaces such as HAVi (which stands for Home Audio/Video Interoperability).
The basic idea behind HAVi is that you may have one node - let's say a TV set in your bedroom - and some network. When you turn the TV on you can see whether there are other devices in the network and use them. For example you may have a video player in the living room. The whole point is that your TV set can be made by one manufacturer and the player by another, so the TV is not aware of the player's functionality at all. Nevertheless, if they both follow the standard you will see the player's functionality on your TV, so you can control both with a single remote control.

How does this fit into the big picture?
When you design an OS, you usually define how a thread/process/application will talk to the OS/kernel. I think this is a mistake; the "nodes" should not think about the network. The network must be a transparent abstraction. In 1394 a node always talks with other nodes.
Even network management is done by nodes. There are various things to be managed in the network - let's say power. One node can become the power manager, but it is still just a node. The network is just a set of rules for who may become a manager and how.
Think about this when you design your kernel - the user does not expect your kernel to be the manager of everything...
The simplest abstraction can be a messaging system, where the kernel is only the medium. The kernel should be neither a source nor a destination for messages. The OS should not be the guy who answers the questions; it must be only the middle man...
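Something like this, as a purely illustrative sketch (my own names, not code from any real system): the kernel owns one inbox per node and a single routing call, and it never originates, answers, or interprets a message.

Code:

#include <cstdint>
#include <map>
#include <queue>
#include <vector>

using NodeId = std::uint32_t;

struct Message {
    NodeId from;
    NodeId to;
    std::vector<std::uint8_t> payload;   // the kernel never interprets this
};

// The kernel is only the medium: it attaches nodes and routes messages.
class Bus {
    std::map<NodeId, std::queue<Message>> inbox_;
public:
    void attach(NodeId n) { inbox_[n]; }          // create an empty inbox

    bool route(Message m) {
        auto it = inbox_.find(m.to);
        if (it == inbox_.end()) return false;     // unknown node: drop it
        it->second.push(std::move(m));            // deliver, never answer
        return true;
    }
    // Even a "power manager" is just another NodeId talking over the bus;
    // the bus itself is never a source or destination.
};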
I know... this is the exact opposite of current thinking, especially at M$. They try to be *everything*, and the result is that a typical windoz application is designed to talk only with the OS. They control everything, so instead of interoperability you think about compatibility with windoz.
Perhaps this is the only business model - you have to make people depend on you. And even if you design a very good OS, the software developers won't like it. Just like the manufacturers don't like 1394. You can't make a profit from a network card that needs no drivers and is a single-chip solution, because everyone can build it the same way you do. To make a profit they need a unique product, so only they can sell it...
The same story with HAVi - the standard is there, but... you get a sort of "interoperability" only if you buy everything from a single manufacturer, and they prefer to keep it that way...

Posted: Sat Sep 22, 2007 11:00 am
by Avarok
IMHO the perfect abstraction is when a program can know what to expect: when it has clearly defined, extraordinarily simple interfaces that will only work the right way. :)

There is no abstraction that can do that, because the concept of abstraction is to cake some make-up on an ugly 50-year-old fat woman. A truly beautiful operating system wouldn't need makeup; but to get there you need to go beyond the OS :shock: and there be monsters :twisted:

1) I've investigated LinuxBIOS, and the idea is sound. Strip the 60,000 archaic 16 bit BIOS functions out and put some modern code with an ELF loader in. They need help, and they need a few modern systems that can run 'em. They need support from Vendors, and they need testers.

2) Reduce primitives. Modern OSes typically use a process, which is a single trust unit, a single memory unit, a single execution unit, and a single name on the "file system". Separate these things and mix and match them natively, rather than making the primitives "add-ons" to the process. Cool things will happen.

3) Using sentinels to mark the end of a string or file was a mistake. We should have used a length value. Think long about this one if you care: slices, digitalmars, crypto, how to fit a structure in a file system. (See the sketch after this list.)

4) It's time to let go. Unless you care about running your OS on decades old hardware, skip the PIC and go for the APIC. Skip APM and go for ACPI. Skip reverting to vm86 mode and learn to work with the mode you have. Skip floppy controllers and focus on SATA, skip modems and focus on ethernet. Maybe even skip 32 bit mode and go straight for 64. The ST and MMX registers are obsolete, use XMM.

5) Too many OSDev's are writing. Stop. Think. Think more. Talk some. Open notepad. Close it. Spend a week thinking about it. Open notepad again. Close it. Talk more. Think for another six months about it. Think some more. Talk about it. Think some more. then begin with

Code:

org 0x7c00
bits 16
start:
6) Don't ever type a word with a dollar sign in front of it again.
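Since I mentioned slices in point 3, here is a purely illustrative sketch of a length-carrying slice (my own names, nothing standard):

Code:

#include <cstddef>

// A slice carries an explicit length with the pointer, so the data is
// not required to end in a sentinel and may contain embedded zero bytes.
struct Slice {
    const char* data;
    std::size_t len;

    // Taking a sub-slice is O(1): no scan for a terminator, and the
    // bounds are clamped instead of trusted.
    Slice sub(std::size_t offset, std::size_t count) const {
        if (offset > len) offset = len;
        if (count > len - offset) count = len - offset;
        return Slice{data + offset, count};
    }
};

// The same (pointer, length) idea describes a region of a file or an
// on-disk structure: an (offset, length) pair instead of a terminator.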

Posted: Sat Sep 22, 2007 11:57 am
by Crazed123
Colonel Kernel wrote: No technological progress ever comes for free. I'm not saying that everyone and their dog should implement their own intermediate language and automated theorem prover, but somebody's gotta do it.
It's a lot easier to port Free Pascal Compiler or Chicken to a new operating system than a new machine architecture. People prefer the option of doing so, and I don't see why the operating system should dictate to its users what they should target their compilers at.
I thought most ILs were pretty straightforward. Do you have any examples?
ILs are pretty straightforward, precisely because they assume (even at the lowest levels) things like a flat virtual address space with at least a small set of system calls (like, for example, sbrk()) accessible.

VM-based operating systems can work, but not based on existing VMs that would impose their own limitations and worldview on the operating system. Instead, the operating system should expose its own abstractions as the primitive environment of the VM and expose its own services as capabilities of the VM.

Besides, safe-code systems are too complex because they push safety verification upwards. Instead of having safety provided by the kernel, you must now maintain and verify an entire toolchain, from the kernel to the runtime libraries to the compilers (JIT and otherwise) and interpreters of your IL.
How often do ISAs change? IMO they grow, but don't really change. So what if someone has to add extra support for SSE6 in two years? Like I said before, nothing comes for free... somebody has to do it. If it's not someone writing a verifier, then it will be somebody writing a C/C++ compiler back-end, or someone modifying the task-switching code in all the OS kernels...
Growth is still dangerous when your operating system needs to scan all code for safety before it runs, because it means that a runnable binary might contain code your OS can't verify the safety of.
Such an architecture is simply too complex. What happened to KISS?
Ask the original poster. :P
What happened to keeping it simple? Well, one branch of the research community went off and worked on microkernels (leading to microkernel hell, which I could *best* describe as the OS version of the Turing Tar Pit), another branch later started work on Safe Code operating systems (which are hideously complex, see above), Bell Labs went and wrote Plan 9 (which is brilliantly simple but lacks a good user interface or end-user documentation), and the rest of the world decided to work on "incrementally evolving" Unix and Windoze, which has given us the crappy feature-piled-on-bugfix-piled-on-feature operating systems we deal with today.

Oh, and somewhere along the line Apple made OSX, putting a multithreading microkernel below the same old crap below a pretty good user interface. But even they could have done far better (see: Amiga, BeOS).

Posted: Sat Sep 22, 2007 12:37 pm
by Colonel Kernel
Crazed123 wrote:I don't see why the operating system should dictate to its users what they should target their compilers at.
Because of the benefits it provides. Everything is a tradeoff. The choice of intermediate language affects toolchain developers. The choice of source language affects all developers for the platform, which is a much bigger and therefore much more important group to support.

You have a funny notion of who the "users" are. :) My Mom is a "user". Compiler writers are a niche audience, and I'm sure there are more than a few who would be happy to target "their" compilers at whatever they are paid to target them at....
ILs are pretty straightforward, precisely because they assume (even at the lowest levels) things like a flat virtual address space with at least a small set of system calls (like, for example, sbrk()) accessible.
Are you thinking in terms of LLVM? I'm only familiar (somewhat) with Java bytecode and MSIL, which both assume support for garbage collection (basically, the "new" operator is in the IL), but aside from that all "memory" is abstracted away to the stack and named variables.

Are you concerned about the inability to efficiently implement languages with native support for continuations (where you want spaghetti stacks instead of the usual linear stacks) -- that sort of thing?
VM-based operating systems can work, but not based on existing VMs that would impose their own limitations and worldview on the operating system. Instead, the operating system should expose its own abstractions as the primitive environment of the VM and expose its own services as capabilities of the VM.
I sort of agree with what you're saying, but I keep getting tripped up over your use of the term "VM". I'm not assuming anything like interpretation or JIT-compilation, or that a specific GC implementation or framework libraries must be used.
Besides, safe-code systems are too complex because they push safety verification upwards. Instead of having safety provided by the kernel, you must now maintain and verify an entire toolchain, from the kernel to the runtime libraries to the compilers (JIT and otherwise) and interpreters of your IL.
You've missed the point of code verification. The ultimate goal would be for safety to be provided by a verifier, that's it. The source-to-IL and IL-to-native compilers can be buggy or downright malicious, but if the generated proof is not valid or the code doesn't match the proof, it will not run. The verifier is part of the OS, not part of any single toolchain. Again, I'm not assuming that there's an interpreter or JIT compiler.

In Singularity for example, the ultimate goal is for the "trusted" code of the entire OS to include only the kernel, verifier, and parts of in-process run-time libraries like the GCs (there is research into type-safe GCs but it may be a ways off).
Growth is still dangerous when your operating system needs to scan all code for safety before it runs, because it means that a runnable binary might contain code your OS can't verify the safety of.
If it can't verify it, it won't run it, or it will run it in a sandbox (i.e. -- separate address space). Where is the danger here exactly...?

While I agree that the theory behind safe-code OSes is complex, I disagree that the architecture it leads to is complex. I actually think it's very simple, once you see how the pieces fit together. I like the idea of being able to prove that code is safe, even if I don't understand exactly how that works. :)

Posted: Sat Sep 22, 2007 1:11 pm
by Avarok
Code provability, when done strictly at compile time, will deny several key optimization strategies and mess up your ability to obfuscate code in unpredictable ways; this alone should give Microsoft the shivers, since that's all commercial vendors can do now to slow down reversers for a few days. Unless you plan on strictly banning reversing software like IDA Pro and OllyDbg.

Actually, retooling your provability mechanism might serve as a perfect reversing tool.

Code provability in a dynamic environment means checking things at run time, and then you're back with the rest of us if you correctly take advantage of the hardware mechanisms to do this.

I thought of leveraging some strategy to soften the advantage you were trying to obtain, by carefully managing PDs and invlpg instructions across context switches between threads to avoid TLB misses rather than haphazardly performing a mov cr3. I'm not sure, but it isn't efficient enough yet in my head.

Sincerely.