
Re: Will the MMU become the next redundant HW block?

Posted: Sun Oct 09, 2011 3:34 pm
by Qeroq
As much as I like managed code, I think there is little to no sense in actually using it in kernel land.

For running managed code (and I'm not speaking about managed code compiled to native code; there's no real point in doing that) you'll always need an abstraction layer written in native code; call it the virtual machine. This VM needs to take care of the low-level stuff that is either difficult to implement in managed code or would break with the design of the language and the reason for using managed code in the first place. Once you have this VM running directly on the hardware and executing your applications (and drivers!), you have already implemented some kind of microkernel in native code, just like Valix or - to some extent - a project of mine intends to be.

Even if you then go with a single address space approach, the MMU will still be useful in this kind of scenario: it can be used not only to isolate processes from each other, but also to produce a clean, well-structured memory layout that would not be possible using physical memory alone without wasting lots of space, because you can easily reserve space in virtual memory for things that grow continuously (for example the stack, as someone already mentioned in this thread).
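The reserve-then-commit idea can be sketched in a few lines of Python; everything here (class, names, sizes) is a toy model of the concept, not how any real kernel does it:

```python
# Toy model of reserve-then-commit virtual memory (all names hypothetical).
# A region reserves a contiguous range of virtual pages up front; physical
# frames are only "committed" when a page is first touched, so a stack can
# grow into its reservation without fragmenting or wasting physical memory.

PAGE_SIZE = 4096

class AddressSpace:
    def __init__(self):
        self.reserved = {}      # region name -> (base, page_count)
        self.committed = set()  # virtual page numbers backed by "physical" frames
        self.next_base = 0x1000_0000

    def reserve(self, name, pages):
        """Reserve a contiguous virtual range; costs no physical memory."""
        base = self.next_base
        self.reserved[name] = (base, pages)
        self.next_base += pages * PAGE_SIZE
        return base

    def touch(self, addr):
        """Commit the page containing addr on first access (like a page fault)."""
        page = addr // PAGE_SIZE
        self.committed.add(page)

aspace = AddressSpace()
stack_base = aspace.reserve("stack", 1024)   # reserve 4 MiB for growth
aspace.touch(stack_base)                     # only 1 page actually committed
print(len(aspace.committed))                 # -> 1
```

The real-hardware equivalent is reserving with PROT_NONE and committing on fault, which is exactly what the MMU gives you for free.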

My $0.02.

Re: Will the MMU become the next redundant HW block?

Posted: Mon Oct 10, 2011 5:53 pm
by gravaera
I like how people think that managed languages imply a lack of the use of hardware memory protection.

Re: Will the MMU become the next redundant HW block?

Posted: Mon Oct 10, 2011 11:26 pm
by NickJohnson
gravaera wrote:I like how people think that managed languages imply a lack of the use of hardware memory protection.
As long as the language prevents you from constructing pointers and manages itself properly, you theoretically don't need hardware memory protection, because this is already an adequate form of memory protection. It may still make practical sense to isolate different instances of a VM in separate address spaces, if it's possible that the VM has bugs or security holes, or if you want to be able to run native code easily.

Re: Will the MMU become the next redundant HW block?

Posted: Thu Aug 23, 2012 1:23 pm
by OSwhatever
I think the trend is obvious: managed code operating systems will be in place pretty soon. It's just natural progress that they use more modern languages. Does this mean that the MMU will become redundant, or that there will be no way to run native code programs? From what I've heard there are two possibilities. One is that native code programs will run in an emulator with its own operating system, a kind of DOSBox solution. The second possibility is to use virtualization, so that native code programs run on a separate instance of an operating system that allows native code. This, however, also means that the MMU is likely to still be used. Which one is best depends on the usage: if interaction between old native code programs and managed code programs is required, then the emulator is likely to be more convenient.

I realize what I'm working on is really old news, and if I used managed code I would probably be an order of magnitude more efficient. However, my knowledge about managed code runtime environments is very limited, so I have to learn the basics first. Also, there aren't many freely available managed code languages out there that produce bytecode. We have Java, C#, and Limbo, but are there any more? The osdev wiki is mostly about classical C and x86 development; maybe we should give some hints about managed code as well?

Re: Will the MMU become the next redundant HW block?

Posted: Fri Aug 24, 2012 2:22 pm
by Brendan
Hi,
OSwhatever wrote:I think the trend is obvious: managed code operating systems will be in place pretty soon.
There are 3 reasons to use an MMU:
  • Protection
  • Massive performance/efficiency improvements (e.g. copy on write, memory mapped files, etc)
  • Breaking hardware limits (e.g. 2 processes using 3 GiB each, on a CPU that only supports 32-bit addresses and only has 1 GiB of RAM)
There are 2 reasons to use managed code:
  • Protection
  • Vendor lock-in (e.g. preventing people from using alternative "untrusted" toolchains so you can sell more copies of your compiler)
There is one big reason to use "byte-code" (which has nothing to do with managed vs. unmanaged):
  • Portable "binaries"
If you use an MMU for performance/efficiency and/or breaking hardware limits, then you could just use the same MMU for protection too, and the only "good" reason to use managed code (protection) becomes pointless. The most sane approach is using an MMU with some form of "byte-code" (for portability).

Of course managed code may be a good idea if there is no MMU; but that only really applies to low cost embedded CPUs (which aren't really intended to run untrusted code in the first place).
OSwhatever wrote:It's just natural progress that they use more modern languages.
Sadly, the "natural progress" in the I.T. industry is for the same ideas to resurface and die again in a cyclic manner. For example; in 1955 someone might think that something is a good idea, a bunch of people will jump on the hype and by 1960 people realise it wasn't as good as the hype made them think and they'll forget it; then in 1975 someone else will come along and a bunch of people that weren't around for the previous "wave of hype" will repeat the cycle with different buzz-words/hype, and then it'll happen again in 1995 and again in 2015.

As far as I can tell, for managed code the current "wave of hype" was started in 2007/2008 by the Singularity project; the wave peaked in about 2010, and the wave is currently collapsing. It'll take a few years before almost everyone stops caring about managed code, and maybe another 15 to 20 years before the next wave of hype starts again. ;)


Cheers,

Brendan

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 1:32 am
by Love4Boobies
Casm wrote:How can swapping to hard drive work without an MMU?
Almost exactly the same way.
Casm wrote:How will an expandable stack work?
This would possibly be a bit more expensive to implement in software alone. However, see my comments on COW, below.
Casm wrote:Managed code can only be as stable as the virtual machine it runs on, and since that must itself run directly on the hardware I don't see the point. You are just giving yourself (or somebody else) an extra layer of code to write.
The bugs you speak of also exist in CPUs. The problem is that bugs in CPUs are more expensive to fix.
Casm wrote:How does having an operating system, which runs an operating system, which runs applications, manage to be an improvement upon an operating system which runs applications directly?
That's not how it works at all. There's one OS and one compiler, which translates bytecode to native code so that only trusted programs can (safely) run. You can do that with AOT compilation (e.g., translate applications at install time) or JIT compilation (and have translation caches so you don't have to repeat any of the work next time you run the program). You probably shouldn't do it with interpretation if it's a general-purpose OS we're talking about.
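The AOT/JIT caching described here boils down to memoizing the translation step; a toy Python sketch (the "bytecode" and "native code" forms are stand-ins, not a real instruction set):

```python
# Minimal sketch of a JIT-style translation cache (hypothetical bytecode):
# translation is done once per distinct program and reused on later runs.
import hashlib

translation_cache = {}
translate_calls = 0

def translate(bytecode):
    """Pretend 'native code' is an uppercased copy; real systems emit machine code."""
    global translate_calls
    translate_calls += 1
    return bytecode.upper()

def run(bytecode):
    key = hashlib.sha256(bytecode.encode()).hexdigest()
    if key not in translation_cache:        # translate only on first sight
        translation_cache[key] = translate(bytecode)
    return translation_cache[key]

run("push 1; push 2; add")
run("push 1; push 2; add")                  # cache hit: no second translation
print(translate_calls)                      # -> 1
```

AOT at install time is the same idea with the cache populated eagerly instead of on first run.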
Solar wrote:Tossing the idea of unmanaged code because it's no longer "vogue" or "state-of-the-art" (I challenge the latter) certainly feels a bit hasty here.
Whenever someone makes a prediction, everyone tries to imagine whether it would make sense for that prediction to come true the next day. That's not how it works... Things are gradually left behind during a considerable amount of time (e.g., see DOS, Win16 API, and now we've pretty much left BIOS behind; all this means we are basically ready to also leave real mode behind in newer x86 CPU's because the new PC's aren't expected to be able to run legacy code anyway).
Farok wrote:For running managed code (and I'm not speaking about managed code compiled to native code; there's no real point in doing that) you'll always need an abstraction layer written in native code, call it the virtual machine.
So to counter the prediction of having managed CPU's you talk about how these CPU's will need to have a VM layer?
Farok wrote:because you can easily reserve space in virtual memory for things to grow continuously (for example the stack, just as someone already mentioned in this thread)
There's no reason why we couldn't still have a stripped-down MMU (or whatever you want to call this hardware device) that only helps with things like COW and growing stacks, so as not to have to keep track of these things in software.
Brendan wrote:There are 3 reasons to use an MMU:
  • Protection
  • Massive performance/efficiency improvements (e.g. copy on write, memory mapped files, etc)
  • Breaking hardware limits (e.g. 2 processes using 3 GiB each, on a CPU that only supports 32-bit addresses and only has 1 GiB of RAM)
Let's tackle one thing at a time. I already know you agree with me that the way to scalability is message-passing rather than shared memory. With managed code, your IPC can be as fast as passing a reference (this is lightning fast), at least in the same NUMA domain. So basically, your whole distributed system's foundation is as good as it can be. If you go outside your NUMA domain, then there are other latency problems to worry about with either scheme.
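The reference-passing claim is easy to see in any shared address space; here's a small Python sketch using threads (which genuinely share one address space) - the queue moves only a reference, never the 16 MiB body:

```python
# In a single address space, "sending" a message can be passing a reference:
# no bytes are copied, and both sides see the same object.
import queue
import threading

channel = queue.Queue()
payload = bytearray(16 * 1024 * 1024)   # 16 MiB message body
received = []

def server():
    msg = channel.get()                 # only a pointer-sized reference moves
    received.append(msg)

t = threading.Thread(target=server)
t.start()
channel.put(payload)                    # O(1), regardless of message size
t.join()
print(received[0] is payload)           # -> True: same object, zero copies
```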

Next, the hardware limit. The advantage you talk about is having a bigger address space that is accessible in a convenient way to the programmer. However, remember that the bigger address space is an illusion---it requires swapping. But swapping can very well be implemented in a managed system, too. In fact, similar technologies are all over the place (overlays in old DOS programs, DLLs in Windows programs, etc.)---except, of course, their goal is different. What about convenience? That doesn't go away either. It doesn't matter what the underlying address space looks like; you're working with the semantics of your managed language, which may implement swapping in a transparent way. It's similar to Java, where you don't really see an address space and mourn for a bigger virtual one. Not only that, but if your program was developed with a 32-bit address space in mind and you suddenly move it to a platform that offers a 64-bit address space, it can take advantage of the extra memory without any modification.
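Swapping above the runtime, as described, can be sketched like this; the store and its API are entirely made up for illustration:

```python
# Sketch of "swapping" implemented above the language runtime rather than
# with an MMU: cold objects are evicted to disk and transparently reloaded.
import os
import pickle
import tempfile

class SwappingStore:
    def __init__(self):
        self.hot = {}       # in-memory objects
        self.cold = {}      # key -> path on disk

    def evict(self, key):
        """Serialize an object out to disk and drop the in-memory copy."""
        fd, path = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as f:
            pickle.dump(self.hot.pop(key), f)
        self.cold[key] = path

    def get(self, key):
        if key in self.cold:                    # "page it back in" on demand
            with open(self.cold.pop(key), "rb") as f:
                self.hot[key] = pickle.load(f)
        return self.hot[key]

store = SwappingStore()
store.hot["big"] = list(range(1000))
store.evict("big")                  # no longer resident in memory
print("big" in store.hot)           # -> False
print(store.get("big")[999])        # -> 999, transparently reloaded
```

A real managed runtime would hide the evict/get behind ordinary object access, which is exactly the transparency being argued for.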
Brendan wrote: There are 2 reasons to use managed code:
  • Protection
  • Vendor lock-in (e.g. preventing people from using alternative "untrusted" toolchains so you can sell more copies of your compiler)
The vendor lock-in thing is simply not true. The system can work with binaries that come from any toolchain as long as the output is valid bytecode. For example, there's more than one Java compiler that produces Java bytecode.
Brendan wrote:Sadly, the "natural progress" in the I.T. industry is for the same ideas to resurface and die again in a cyclic manner.
You're not being specific enough. Sometimes old, abandoned ideas re-emerge because they make sense with new technologies. E.g., embedded systems went through pretty much the same kinds of transitions that microcomputers did, except they came from the opposite direction.
Brendan wrote:As far as I can tell, for managed code the current "wave of hype" was started in 2007/2008 by the Singularity project; the wave peaked in about 2010, and the wave is currently collapsing. It'll take a few years before almost everyone stops caring about managed code, and maybe another 15 to 20 years before the next wave of hype starts again. ;)
You are ignoring the fact that most application code out there is currently managed and this trend is only increasing. This was not true in the past; it is for this reason that it's becoming interesting to look at operating systems from a new perspective.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 6:27 am
by Owen
Love4Boobies wrote:You are ignoring the fact that most application code out there is currently managed and this trend is only increasing. This was not true in the past; it is for this reason that it's becoming interesting to look at operating systems from a new perspective.
This doesn't seem to be true - at least from an end-user applications perspective (i.e. discounting web apps, because the language a web app is written in is immaterial). Most of the applications on my system are not managed code, and 3 of the top 4 spots on the TIOBE programming language index are filled by native languages (in fact, all C-family languages... though this is no surprise), which hold 36% of the market. Meanwhile, the only managed programming languages on that list which could be considered to hold significant end-user market share are C# and Java, which together amount to 22% (though both of those have market shares which are inflated - to some extent - by server-side development).

I don't think it can be ignored that C, C++ and Objective-C have all seen major upward trends in use recently.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 8:20 am
by Love4Boobies
I suspect that you are a Linux user, where managed code is less common, except on servers. However, let's look at the OS usage shares according to Wikimedia accesses: Windows is in the lead with 69.15% (and that figure also includes mobile devices, where Windows is far from the lead!). And managed code doesn't stop there; e.g., virtually all phone SIM cards run Java code.

I wouldn't trust TIOBE too much because those statistics are based on the results of Web searches rather than proper investigation. My own experience with Windows is that the number of programs (including games) asking for .NET has significantly increased over the past few years. I don't claim that my experience is a good indication of the current trends either, of course.

Let's just wait and see. :)

In the meantime, here's a little rant... C isn't gone because, although it's a bad language for programming in general (it asks for too much and offers too little in return), it's still pretty much your best option for embedded programming; it's also the only language for which UNIX systems have an official API binding, unfortunately. C++ has some parts that make sense, but on the whole it's a horribly complex language, and that's always bad. I don't know Objective-C.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 8:57 am
by Owen
Love4Boobies wrote:I suspect that you are a Linux user, where managed code is less common, except for servers.
False (Well, I have two Linux machines, both of which are servers), but my desktop runs Windows, and my laptop Mac OS X.
However, let's look at the OS usage shares, according to Wikimedia accesses: Windows is in the lead with 69.15% (also includes mobile devices, where Windows is far from the lead!). And managed code doesn't stop there; e.g., virtually all phone SIM cards run Java code.
While some UICC cards do feature a Java application platform, the most use I've seen made of it is a menu of links. Hardly a resounding example.

And while SIM/UICC cards can be implemented as "JavaCard" cards, this is rarely done (as JavaCards cost more).
I wouldn't trust TIOBE too much because those statistics are based on the results of Web searches rather than proper investigation. My own experience with Windows is that the number of programs (including games) asking for .NET has significantly increased over the past few years. I don't claim that my experience is a good indication of the current trends either, of course.
I have noticed this also. Of course, None + Some is always a significant increase ;).

Jokes aside, I think I have one entirely .net application on my system. There are a few hybrid applications, mostly by Microsoft.

And, of course, if the Win32 documentation on MSDN is anything to go on, I bet most of those "managed" applications are full of P/Invoke calls. Hardly managed ;)
In the meantime, here's a little rant... C isn't gone because, although it's a bad language for programming in general (it asks for too much and offers too little in return), it's still pretty much your best option for embedded programming; it's also the only language for which UNIX systems have an official API binding, unfortunately. C++ has some parts that make sense, but on the whole, it's a horribly complex language and that's always bad. I don't know Objective C.
C++ is horribly complex. Unfortunately, nobody seems to have made a language with the same applicability and feature set which is better (I'll even ignore the C interoperability feature if you want - just to give said language the ability to get rid of many of the issues that plague C++).

Objective-C is a language with two personalities: one C, one Smalltalk. It has manual resource management, but it has a very consistent ownership model (primarily thanks to Cocoa) and good tools for dealing with some of the corner cases (primarily NSAutoreleasePool), plus recently added features like Automatic Reference Counting (and, interestingly, the recently dropped feature of garbage collection).

I think it's quite telling that Microsoft, who seem to have the most to gain from managed code, recently introduced C++ AMP as their new concurrency system.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 10:00 am
by jnc100
One of the advantages of verifiably memory-safe code is that, in conjunction with a suitably large address space, you can run all applications and the kernel safely together in the same address space, without worrying about one process being affected by errors or malicious code in another process. This guarantee is provided by a subset of Microsoft's CIL (excluding the direct memory access instructions introduced by the 'unsafe' keyword in C#), and presumably by a subset of Java too (although I am not familiar with Java bytecode). It is much harder (but probably not impossible) to establish by static analysis of C/C++ source, and I would imagine almost impossible for assembly or any native binary.
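A verifier of this kind is conceptually just a check over the instruction stream before anything runs; here is a toy Python sketch with invented instruction names (real CIL/Java verification also checks types and stack depth, which this ignores):

```python
# Toy static verifier: a loader accepts only programs whose instructions come
# from a memory-safe subset, rejecting anything that touches raw addresses.
SAFE_OPS = {"push", "pop", "add", "load_field", "store_field", "call", "ret"}

def verify(program):
    """Return True iff every instruction is from the safe subset."""
    for op, *_ in program:
        if op not in SAFE_OPS:
            return False
    return True

safe = [("push", 1), ("push", 2), ("add",), ("ret",)]
unsafe = [("push", 0xB8000), ("poke_addr",)]    # writes to an arbitrary address

print(verify(safe))     # -> True
print(verify(unsafe))   # -> False: loader refuses to run it
```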

The single address space OS has the primary advantage of fast IPC, thus allowing a microkernel to become a more realistic proposition.

That is not to say that the MMU is redundant, for the many reasons stated by Brendan (e.g. fast hardware support for swap-to-disk, CoW, mmio, etc.); you could also, in theory, support separate address spaces (accepting the additional cost) for certain individual applications that require a very large heap all to themselves.

Personally, I can see the future being a native OS kernel with a byte-code interpreter/JIT running on top of it, which then guarantees the memory safety and process isolation of each application (e.g. Android). If the particular byte code is portable across platforms, so much the better. Running safe code in the kernel itself is difficult due to the need for direct physical memory accesses to deal with devices, and probably unnecessary, as I would imagine most small kernels could be declared 'safe' by static analysis.

DOI: writing a C# kernel purely to see if it's possible rather than as the 'next great thing' (TM).

Regards,
John.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 10:26 am
by OSwhatever
With 64-bit systems, page table walks are becoming more and more expensive. x86-64 now has four levels, expected to increase to five when necessary. The page tables consume memory and slow the system down more and more as the virtual address space grows. In the end the page table doesn't scale, and we must find new solutions. Object protection is starting to become more and more attractive.
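To see where the walk cost comes from, here's the x86-64 4-level split of a canonical 48-bit virtual address in Python (the shifts and masks follow the architecture; the function name is mine). Each 9-bit index is a dependent memory access on a TLB miss:

```python
# On x86-64, translating one virtual address means splitting it into four
# 9-bit table indices plus a 12-bit page offset; a TLB miss turns each index
# into a serialized memory read (PML4 -> PDPT -> PD -> PT).
def walk_indices(vaddr):
    offset = vaddr & 0xFFF                       # 12-bit offset within 4 KiB page
    indices = [(vaddr >> shift) & 0x1FF          # four 9-bit indices
               for shift in (39, 30, 21, 12)]    # PML4, PDPT, PD, PT
    return indices, offset

indices, offset = walk_indices(0x0000_7FFF_FFFF_F123)
print(indices)   # -> [255, 511, 511, 511]
print(hex(offset))  # -> 0x123
```

A fifth level (LA57) simply prepends another 9-bit index, i.e. another dependent read per miss.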

Also, I haven't noticed a recent increase in C, C++, or Objective-C. Java, C#, and Python are the languages I see on the rise for application programming. Most people choose them because they can get the job done much faster and bugs are easier to find thanks to the per-object protection.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 11:49 am
by Owen
You haven't seen a recent increase in Objective-C use?

You obviously haven't spent any time with a Mac or an iDevice. Being as your only option for GUIs on both of those is Objective-C... yeah...

As for Java, C# and Python: Only C# really "works" on the desktop.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 4:09 pm
by Brendan
Hi,
Love4Boobies wrote:
Brendan wrote:There are 3 reasons to use an MMU:
  • Protection
  • Massive performance/efficiency improvements (e.g. copy on write, memory mapped files, etc)
  • Breaking hardware limits (e.g. 2 processes using 3 GiB each, on a CPU that only supports 32-bit addresses and only has 1 GiB of RAM)
Let's tackle one thing at a time. I already know you agree with me that the way to scalability is message-passing rather than shared memory. With managed code, your IPC can be as fast as passing a reference (this is lightning fast), at least in the same NUMA domain. So basically, your whole distributed system's foundation is as good as it can be. If you go outside your NUMA domain, then there are other latency problems to worry about with either scheme.
To me, the biggest problem with IPC is changing from one process' "working set" to another process' "working set" - e.g. fetching cache lines for the second process and flushing (and if modified, writing) cache lines used by the previous process to make room. Paging does make the cost of this "working set switch" a little higher, but "single address space managed code" doesn't avoid it. Given that the cost of a "working set switch" can't be avoided, my approach is to minimise the number of "working set switches". Basically, when message/s are sent they're just put in a FIFO queue so that any "working set switches" can be postponed - e.g. you can send 100 messages and do one "working set switch" rather than doing 100 of these "working set switches".

Now, putting a reference into a FIFO queue is fast, and it doesn't matter much what this reference is. It could be the address of the message data, or it could be the address of a page or page table that contains the message data. Managed code would make little difference.

Essentially what I'm saying is that synchronous messaging is bad; and managed code only really helps if your IPC is bad.
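The batching idea above can be put in a toy Python sketch, with a counter standing in for the cost of a working set switch (all names invented):

```python
# Senders append messages to a per-receiver FIFO, and the expensive
# "working set switch" happens once per batch instead of once per message.
from collections import deque

class BatchedChannel:
    def __init__(self):
        self.fifo = deque()
        self.switch_count = 0   # stand-in for cache-refill / context-switch cost

    def send(self, msg):
        self.fifo.append(msg)   # cheap: just enqueue, no switch yet

    def drain(self):
        """One 'working set switch', then process everything queued."""
        self.switch_count += 1
        batch = list(self.fifo)
        self.fifo.clear()
        return batch

ch = BatchedChannel()
for i in range(100):
    ch.send(i)                  # 100 sends...
batch = ch.drain()              # ...one switch
print(len(batch), ch.switch_count)   # -> 100 1
```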
Love4Boobies wrote:Next, the hardware limit. The advantage you talk about is having a bigger address space that is accessible in a convenient way to the programmer. However, remember, the bigger address space is an illusion---it requires swapping. But swapping can very well be implemented in a managed system, too. In fact, similar technologies are all over the place (overlays in old DOS programs, DLL's in Windows programs, etc.)---except, of course, their goal is different. What about convenience? That doesn't go away either. It doesn't matter what the underlying address space looks like; you're working with the semantics your managed language, which may implement swapping in a transparent way. It's similar to Java, where you don't really see an address space and mourn for a bigger virtual address space. Not only that, but if your program was developed with a 32-bit address space in mind and you suddenly move it to a platform that offers it a 64-bit address space, it can just take advantage of the extra memory without any modification.
For "single address space", you could implement a way of splitting things up into pieces and mapping the "most likely to be used" pieces into that single address space (while evicting "less likely to be used" pieces to make room); but you'd essentially be implementing an MMU in software, and doing it in software is likely to be far worse than using the hardware's MMU.
Love4Boobies wrote:
Brendan wrote: There are 2 reasons to use managed code:
  • Protection
  • Vendor lock-in (e.g. preventing people from using alternative "untrusted" toolchains so you can sell more copies of your compiler)
The vendor lock-in thing is simply not true. The system can work with binaries that come from any toolchain as long as the output is valid bytecode. For example, there's more than one Java compiler that produces Java bytecode.
It is possible for managed code to ignore the potential vendor lock-in advantage and use (for example) open standards instead. Just because some people choose to ignore an "advantage" doesn't mean that the potential advantage ceases to exist.
Love4Boobies wrote:
Brendan wrote:As far as I can tell, for managed code the current "wave of hype" was started in 2007/2008 by the Singularity project; the wave peaked in about 2010, and the wave is currently collapsing. It'll take a few years before almost everyone stops caring about managed code, and maybe another 15 to 20 years before the next wave of hype starts again. ;)
You are ignoring the fact that most application code out there is currently managed and this trend is only increasing. This was not true in the past; it is for this reason that it's becoming interesting to look at operating systems from a new perspective.
There are 2 types of people - OS developers that get to choose what native applications for their OS have to use; and "everyone else" (application developers and end users) who have no choice. I'm talking about the former (the people who have a choice) - e.g. the number of OS developers who are interested in making managed code a requirement for native applications.


Cheers,

Brendan

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 4:39 pm
by OSwhatever
Brendan wrote:To me, the biggest problem with IPC is changing from one process' "working set" to another process' "working set" - e.g. fetching cache lines for the second process and flushing (and if modified, writing) cache lines used by the previous process to make room. Paging does make the cost of this "working set switch" a little higher, but "single address space managed code" doesn't avoid it. Given that the cost of a "working set switch" can't be avoided, my approach is to minimise the number of "working set switches". Basically, when message/s are sent they're just put in a FIFO queue so that any "working set switches" can be postponed - e.g. you can send 100 messages and do one "working set switch" rather than doing 100 of these "working set switches".

Now, putting a reference into a FIFO queue is fast, and it doesn't matter much what this reference is. It could be the address of the message data, or it could be the address of a page or page table that contains the message data. Managed code would make little difference.
With multiple address spaces you never get rid of copying messages from one address space to another, though, unless you send entire pages.

Re: Will the MMU become the next redundant HW block?

Posted: Sat Aug 25, 2012 5:33 pm
by Owen
OSwhatever wrote:With multiple address spaces you never get rid of the copying messages from one address space to another though, unless you send entire pages.
So you make the messages small. Let's, for example, take a look at the VFS.

A traditional, POSIXy VFS exposes the functions
  • open
  • close
  • read
  • write
  • sync
  • etc
Immediately, you can look at these and go
  • Open, close: Well, you can't really get rid of them. Open is going to be relatively expensive (copying of the file name), but open is itself a relatively expensive operation anyway
  • Read, write: These are going to involve expensive copies
  • sync: Somebody's going to ***** hard if you don't implement a way to sync changes to disk (probably immediately after their database gets corrupted because the power failed and it couldn't sync its changes to disk)
So, you look at it a bit and go "Hmm, I wonder if I can get rid of read/write. They're going to be both really common and quite expensive. Hmm, what about if I make 'files' behave as memory regions, where you can map pages in/out, and then add methods to force changes to disk?"

Or, you move to memory mapping, because memory mapping removes the copying (and, indeed, removes the copying on monolithic kernels as well).

And with this, you've eliminated 99% of the copying, because the app can write directly into the VFS cache, and 99% of the messages, because most apps don't need explicit syncs (the implicit on-fclose sync is fine).
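The write-into-the-cache idea can be tried directly with a real memory-mapped file; this Python sketch writes through the mapping and syncs only once, explicitly, rather than copying through read/write calls:

```python
# Map a file and write straight into the mapped pages (i.e. the page cache
# on a real OS), with one explicit flush instead of per-call copies.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * 4096)          # size the file to one page

with mmap.mmap(fd, 4096) as view:
    view[0:5] = b"hello"            # write directly into the mapped page
    view.flush()                    # explicit sync, only when the app wants it

with open(path, "rb") as f:
    content = f.read(5)
print(content)                      # -> b'hello'

os.close(fd)
os.remove(path)
```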

Then you look at the network system. You see that read/write/etc are flawed in the same way; so you have the network stack pass the application "buffers" to fill (where a buffer is essentially a memory map plus an offset which behaves similarly to a file descriptor's position), with the buffer position set to just past the length of headers it expects that it either will or might need (because it knows that you're writing to a TCP socket which has its traffic routed over a given interface), and also passes up a "max length" hint so that when you get to 1550 bytes you hopefully don't force the network stack to re-packetize your data.
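The "buffer with headroom" trick can be put in a toy Python version (names and sizes invented): the stack hands out a buffer whose position starts past the space reserved for headers, so prepending them later needs no re-copy of the payload:

```python
# A buffer that reserves headroom for protocol headers: the application
# appends payload after the headroom, and the stack later prepends headers
# into the reserved space in place.
class NetBuffer:
    def __init__(self, capacity, headroom):
        self.data = bytearray(capacity)
        self.start = headroom       # payload begins after reserved header space
        self.end = headroom

    def append(self, payload):
        self.data[self.end:self.end + len(payload)] = payload
        self.end += len(payload)

    def push_header(self, header):
        """Prepend a header into the reserved headroom: no payload copy."""
        self.start -= len(header)
        self.data[self.start:self.start + len(header)] = header

buf = NetBuffer(capacity=2048, headroom=54)   # room for eth+IP+TCP headers
buf.append(b"GET / HTTP/1.1\r\n")
buf.push_header(b"TCPHDR")                    # stack fills headers in place
print(bytes(buf.data[buf.start:buf.end])[:6])  # -> b'TCPHDR'
```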

And then you look back at this, and you note that nowhere critical does this require any more copying than a monolithic OS or a managed OS does; it often beats the monolithic design; and maybe you'll tolerate returning to the monolithic-OS copy count for convenience, and therefore have both a "blisteringly fast" API for apps which really need it and a "slower but easier, matches existing conventions" API for apps which don't need the full speed (e.g. your IM app or web browser probably isn't going to notice the difference between zero-copy and one-copy networking).