Pitfalls of lacking a MMU
- Love4Boobies
- Member
- Posts: 2111
- Joined: Fri Mar 07, 2008 5:36 pm
- Location: Bucharest, Romania
Re: Pitfalls of lacking a MMU
AOT compilation and interpretation are two distinct techniques; an "AOT interpreter" is like saying you have an AOT JIT compiler.
Anyway, yours is the technique used by most managed OSes. However, the bigger question is whether this makes sense for programs of considerable complexity---at some point, runtime optimizations based on runtime profiling could really make a difference.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Pitfalls of lacking a MMU
ACcurrent wrote:Well what I have done for my OS (and the reason its taking forever) is create an AOT interpreter which "installs" the app from a bytecode. The generated code is safely guarded through the compiler and every time you launch the app the executable code is verified and launched. NO MMU!
Love4Boobies wrote:...and managed code (performance depends on the implementation but it's usually better for some things, such as IPC, and worse for some, such as using... arrays).
Managed-code operating systems seem to be quite a trend these days. The MMU is a lovely thing in many ways, but so is managed code.
Since you are working in the area, do you have any suggestions on how to create better processor support for managed code (for example, object boundary protection and so on)? This idea is very young and the hardware support is sparse.
- Love4Boobies
- Member
- Posts: 2111
- Joined: Fri Mar 07, 2008 5:36 pm
- Location: Bucharest, Romania
Re: Pitfalls of lacking a MMU
@ACcurrent: You deleted your post while I was writing an answer.
OSwhatever wrote:Since you are working in the area, do you have any suggestions how create better processor support for managed code (for example object boundary protection and so on). This idea is very young and the hardware support is sparse.
There wouldn't be a big difference to it; the hardware would do pretty much the same thing that the software does.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
Re: Pitfalls of lacking a MMU
Hi,
OSwhatever wrote:Since you are working in the area, do you have any suggestions how create better processor support for managed code (for example object boundary protection and so on). This idea is very young and the hardware support is sparse.
For something like boundary protection, maybe the CPU could allow you to define the start address and end address of each object, and raise an exception if you try to access something outside of the allowed range. Then you could give each of these some attributes, so that (for example) you can say "this object is read-only" and the CPU could raise an exception if you try to write to a read-only object. To make it fast you could have some sort of "object identifier", and a table of descriptors that determine the attributes, starting address and ending address of each "object identifier". To make it even faster you could have special CPU registers to hold the object identifiers, and a set of instruction prefixes to tell the CPU which of those "object identifier registers" to use for every read or write.
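As a rough illustration of the scheme being proposed, here is a software model of what such hardware might do on every tagged access. All names (the descriptor layout, `check_access`, the flag values) are hypothetical, chosen only to make the idea concrete:

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical model of the proposed hardware: each "object identifier"
 * indexes a descriptor holding a start address, an end address, and
 * attribute flags. Names and layout are illustrative only. */

#define OBJ_READ_ONLY 0x1

typedef struct {
    uint32_t start;   /* first valid address of the object */
    uint32_t end;     /* last valid address of the object  */
    uint32_t attrs;   /* e.g. OBJ_READ_ONLY                */
} obj_descriptor;

enum access_result { ACCESS_OK, FAULT_BOUNDS, FAULT_PROTECTION };

/* What the CPU would do on every read/write tagged with an object id. */
enum access_result check_access(const obj_descriptor *table, int id,
                                uint32_t addr, int is_write)
{
    const obj_descriptor *d = &table[id];
    if (addr < d->start || addr > d->end)
        return FAULT_BOUNDS;          /* outside the allowed range */
    if (is_write && (d->attrs & OBJ_READ_ONLY))
        return FAULT_PROTECTION;      /* write to a read-only object */
    return ACCESS_OK;
}
```

The "object identifier registers" and instruction prefixes would then just select which table entry this check runs against, without reloading the descriptor from memory each time.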
I wonder why Intel didn't do something like this back in the early 1980's when they were designing the way segmentation would work in protected mode...
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Pitfalls of lacking a MMU
Brendan wrote:For something like boundary protection, maybe the CPU could allow you to define the start address and end address of each object, and raise an exception if you try to access something outside of the allowed range. [...] I wonder why Intel didn't do something like this back in the early 1980's when they were designing the way segmentation would work in protected mode...
The question is whether this kind of protection would cause just as many or more memory references as it would if an MMU were used. In practice you can do this with segment descriptors on x86, but the allowed number of LDTs could be a limiting factor, not to speak of how slow it is to load a descriptor every time. It would be interesting to see one of those architectures. It's like going from accessing a general address space to accessing object space via identifier + offset.
Re: Pitfalls of lacking a MMU
Brendan wrote:I wonder why Intel didn't do something like this back in the early 1980's when they were designing the way segmentation would work in protected mode...
I beg your pardon, but segmentation in 286 protected mode works exactly as you described. There are objects (segments) with a base address and size; a writable flag is also involved. If you reference outside the segment or try to write to a read-only object, the CPU raises an exception. The identifier of an object is the selector index, and you do have special CPU registers to hold the object descriptors. What's the difference?
Re: Pitfalls of lacking a MMU
berkus wrote:It's not scalable just the same. With only 8192 descriptors available the space becomes a bit too tight really quickly.
You forgot about LDTs. 8190*8191 is quite a lot of descriptors; the 16 MB memory limit is more likely the bottleneck.
Re: Pitfalls of lacking a MMU
turdus wrote:I beg your pardon, but segmentation in 286 protected mode works exactly as you described. There are objects (segments) with a base address and size; a writable flag is also involved. If you reference outside the segment or try to write to a read-only object, the CPU raises an exception. The identifier of an object is the selector index, and you do have special CPU registers to hold the object descriptors. What's the difference?
= "Oh yeah, they did do that!"
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Pitfalls of lacking a MMU
berkus wrote:It's not scalable just the same. With only 8192 descriptors available the space becomes a bit too tight really quickly.
turdus wrote:You forgot about LDTs. 8190*8191 is quite a lot of descriptors; the 16 MB memory limit is more likely the bottleneck.
I think the real bottleneck is speed. The descriptors were never really designed for this, and loading a descriptor is a very slow operation. In a system that supports object identifiers with boundaries, the descriptors must be cached for speed, a bit like a TLB.
Re: Pitfalls of lacking a MMU
One problem with hardware memory protection is that you often get stuck with it turned on. No matter how you speed up segments, they'll always be slower than regular (compiler-verified if you want) objects, all else being equal.
The same applies, to a lesser degree, to paging. Paging, however, has other benefits (e.g. virtual memory) that would be impossible or highly inefficient/specialized without some kind of hardware support (e.g. to signal when a page needs to be loaded back into RAM).
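To make the "signal when a page needs to be loaded back" point concrete, here is a sketch of a demand-paging fault path. The page-table layout, the counter, and the backing-store stub are all hypothetical stand-ins; the point is that without a hardware "page not present" trap, every single access would need the explicit software check shown here:

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

#define PAGE_SIZE   4096
#define NUM_PAGES   16
#define PTE_PRESENT 0x1

typedef struct {
    uint32_t flags;              /* PTE_PRESENT when the page is in RAM */
    uint8_t  frame[PAGE_SIZE];   /* stand-in for a physical frame       */
} pte;

static int faults_taken = 0;

/* Stub for reading a page back from swap/backing store. */
static void load_from_backing_store(pte *entry, uint32_t page)
{
    memset(entry->frame, (uint8_t)page, PAGE_SIZE);
    entry->flags |= PTE_PRESENT;
}

/* The fault path: hardware would raise this trap for free; pure software
 * would have to run the presence check on every access. */
uint8_t read_byte(pte *table, uint32_t addr)
{
    uint32_t page = addr / PAGE_SIZE, off = addr % PAGE_SIZE;
    if (!(table[page].flags & PTE_PRESENT)) {   /* "page fault" */
        faults_taken++;
        load_from_backing_store(&table[page], page);
    }
    return table[page].frame[off];
}
```

Only the first touch of a page pays the load; subsequent accesses to the same page go straight through, which is exactly the behaviour the MMU gives you without any per-access software cost.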
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: Pitfalls of lacking a MMU
To get over with the segmentation misconceptions:
You have room for slightly less than 2^26 descriptors, which is more than either the 2^20 real-mode address space or even the 2^24 addressable range of the 286. On a 386, you can cover all possible memory in chunks of 128 bytes, much finer than the 4096-byte pages. And that's before you even consider the possibility of modifying the tables themselves. The only real downside is that you can't have more than 16K distinct objects without making the process aware of manual swapping of entries; that still wasn't likely to touch the limits of any 386 at the time.
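A quick back-of-the-envelope check of that figure (ignoring the null selector and the handful of reserved slots that make the real count "slightly less"):

```c
#include <stdint.h>
#include <assert.h>

/* Descriptor capacity: each of the 8192 GDT slots can name an LDT,
 * and each LDT holds up to 8192 descriptors of its own. */
uint64_t max_descriptors(void)
{
    uint64_t gdt_slots = 8192;       /* 2^13 GDT entries */
    uint64_t ldt_entries = 8192;     /* 2^13 descriptors per LDT */
    return gdt_slots * ldt_entries;  /* 2^26, before reserved slots */
}
```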
Segmentation being slow is a similar bit of nonsense: the CPU already caches the descriptors that are in use. It was only after the 486, when it became important to optimise for what the commonly used tools actually did, that paging became the more likely candidate for attention. Segmentation only requires a single bus access to change objects, whereas paging requires one to four accesses for each miss, depending on the system used (with one being exceedingly rare).
Paging is also inferior to segmentation when considering the capabilities. Segmentation has more levels of protection, can separate permissions for code and data, and has an arbitrarily fine granularity. Anything implementable by paging can be implemented using segmentation, but the reverse is actually not possible.
Segmentation is also not slower than managed code by definition. It is definitely overhead, but so is managed code, where garbage collection is mandatory and protection checks have to be added by the compiler. It is only the current abysmally slow implementation of segmentation that makes the race an obvious win for any of its opponents.
A monk might have provided you with the following bit of wisdom: paging exists because of laziness.
Re: Pitfalls of lacking a MMU
Yes, segmentation can do everything paging can do and more, but that's getting dangerously close to a Turing tarpit. Paging is not there just due to laziness; it simplifies a lot of things in ways that make maintenance and bug-hunting (of the memory system) easier. Virtual memory works well when you can swap fixed-size blocks, for example.
Also note that I only said segmentation is slower than non-segmentation, all else being equal. Managed code often has its own overheads, but they're not prerequisites for having the compiler check for in-bounds memory access. This is just a problem with hardware memory protection in general: even when you can remove the overhead at compile time, the hardware just puts it back.
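A small sketch of the "remove the overhead at compile time" argument. The accessor and names below are illustrative; the point is that the software check is visible to the optimizer, which may prove it dead and drop it, whereas a hardware segment-limit check is paid on every access regardless of what the compiler proved:

```c
#include <assert.h>
#include <stdint.h>

#define LEN 8

static int32_t data[LEN];

/* Checked accessor, as a verifying compiler might emit it. */
int32_t checked_get(uint32_t i)
{
    assert(i < LEN);      /* the "managed" bounds check */
    return data[i];
}

int32_t sum_all(void)
{
    int32_t s = 0;
    /* i < LEN is guaranteed by the loop condition, so an optimizer can
     * legitimately eliminate the per-access check after inlining;
     * hardware cannot be told that the proof exists. */
    for (uint32_t i = 0; i < LEN; i++)
        s += checked_get(i);
    return s;
}
```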
Re: Pitfalls of lacking a MMU
Hi,
Combuster wrote:You have room for slightly less than 2^26 descriptors, which is more than either 2^20 real mode space and even the 2^24 addressable range of the 286.
I'm guessing "2^26" comes from 8192 entries per LDT multiplied by "8192-ish" LDT entries in the GDT. It's relatively easy to have one GDT entry (that says where the currently used LDT is) per CPU and modify it during task switches. In that case, the system-wide limit is "unlimited" (or limited only by the amount of RAM).
Combuster wrote:Segmentation being slow is a similar bit of nonsense: the CPU already caches descriptors that are in use. Segmentation only requires a single bus access to change objects, whereas paging requires one to four for each miss depending on the system used (with one being exceedingly rare).
That depends a lot on how it's used. For example, if you use one segment per object and have a linked list of 12345 objects, then traversing that linked list is going to be painful.
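The pain is easy to see in miniature. In the sketch below, the selector load and descriptor fetch that a "one segment per object" design would do on every hop is modelled by a counted helper; the names are hypothetical, but the access pattern is the point:

```c
#include <stddef.h>
#include <assert.h>

typedef struct node {
    struct node *next;
    int value;
} node;

static int descriptor_loads = 0;

/* Stand-in for the selector load + descriptor fetch the CPU would
 * perform before the node's segment can be touched. */
static node *load_segment(node *n)
{
    if (n) descriptor_loads++;
    return n;
}

int sum_list(node *head)
{
    int s = 0;
    /* Every hop to a new object pays a descriptor load. */
    for (node *n = load_segment(head); n; n = load_segment(n->next))
        s += n->value;
    return s;
}
```

With 12345 nodes that is 12345 descriptor loads for one traversal; with flat addressing (or one segment covering the whole heap) the same walk is just pointer chasing, and the descriptor cache never helps because each node is a different segment.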
In the most general possible terms, I'd question whether very fine grained isolation (e.g. "per object") is worth the extra/unavoidable overhead (regardless of how it's done - with segmentation, managed code or something else); and whether coarse grained isolation (e.g. isolating processes from each other) is a much better compromise between granularity and overhead.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Pitfalls of lacking a MMU
Hi,
Quick note - I split this to get all the functional programming stuff out of the "lacking an MMU" topic.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.