Your experience on Segmentation vs. Paging

linguofreak · Post by **linguofreak** » Sat Jan 12, 2013 3:46 pm

rdos wrote:Paging is not an effective protection mechanism. It is a complete disaster for protection unless multiple address spaces are also used.

What do you know? Every major OS these days uses multiple address spaces.

Now, I do like the idea of using a segmentation-type mechanism to make switching address spaces faster, and to let programs access multiple address spaces concurrently (take for example the PDP-11, which, while not the best example of what I'd like to see, had four sets of address translations active at any one time: kernel code, kernel data, user code, and user data). But the kind of segmentation scheme that would satisfy me probably wouldn't satisfy you, if you want something that can protect programs from themselves.

rdos wrote:Basically, paging alone offers no better protection than real mode, where anything can be overwritten by anybody.

Poppycock. For protecting programs from each other, and protecting the kernel from programs, paging does quite well. Where it doesn't do so well is in protecting programs from themselves (catching array overruns and such).

The thing is, protecting a program from itself is great for debugging (so that you catch an out-of-bounds array index when it occurs, rather than at the end of a seven-level chain of wild pointers), but just slows things down in production code. Therefore, it's a good thing to implement in software, but not so good to implement in hardware. Witness the iAPX 432 and how poorly it did.

For production code, self-protection doesn't really do anything: if there is a pointer bug in any code that runs while a program is executing (whether it be in the program itself or in a library), it will cause failures. Whether those failures cause an immediate segfault (with self-protection) or a delayed heisenbug (without it) is irrelevant to the end user. The program still doesn't do the job he set it to.

linguofreak · Post by **linguofreak** » Sat Jan 12, 2013 3:59 pm

Brendan wrote:
So, someone writing malicious code runs your OS inside Bochs, figures out which addresses they need to mess with in about 5 minutes; and then your OS sets the world record for being pawned.

You can carry out the randomization at boot time (for your kernel) or at load time (for applications/libraries), so the addresses they find on one run in Bochs won't be valid when they try cracking real hardware (or even when they do another run in Bochs).

Of course, rdos's complaint is still belied by the fact that each of the "big three" OSes at least has ASLR available in long mode, even if it isn't the default.
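As a rough illustration of load-time randomization, here is a minimal Python sketch. The function name `pick_load_base`, the region bounds, and the 4 KiB page size are assumptions made for the example, not how any particular OS does it; a real loader would also have to relocate the image and cope with fragmentation.

```python
import secrets

PAGE_SIZE = 4096  # assumed 4 KiB pages


def pick_load_base(region_start, region_size, image_size):
    """Pick a random page-aligned base so the image fits inside the region.

    Hypothetical helper: region_start is assumed to be page-aligned.
    """
    pages_needed = -(-image_size // PAGE_SIZE)   # round up to whole pages
    total_pages = region_size // PAGE_SIZE
    slots = total_pages - pages_needed + 1       # number of valid start pages
    if slots < 1:
        raise ValueError("image does not fit in the region")
    # secrets draws from the OS's CSPRNG, so the base is hard to predict
    return region_start + secrets.randbelow(slots) * PAGE_SIZE


if __name__ == "__main__":
    # Each run (or each load) is expected to print different addresses.
    for _ in range(3):
        print(hex(pick_load_base(0x0040_0000, 0x0100_0000, 0x5000)))
```

Addresses observed on one run tell an attacker nothing reliable about the next one, which is the point being made about Bochs.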

linguofreak · Post by **linguofreak** » Sat Jan 12, 2013 5:19 pm

tom9876543 wrote:The Intel 286 Processor was a disaster, it was badly designed.

The big disaster was that 286 protected mode wasn't back-compatible with existing 8086 software. For things that didn't have to be back-compatible with the 8086, it was decent enough compared to other microcomputer processors of the day (and even for things that did have to be back-compatible with the 8086, its real-mode performance was better than the 8086's; it just couldn't use any of its protection features in that case).

tom9876543 wrote:The GDT was an extremely bad design decision.

For a processor that had to be back-compatible with the 8086, yes. For something without prior back-compatibility requirements, given the era it was designed in and the class of machine it was meant for, it wasn't really that bad.

It became burdensome and useless (for anything other than back-compatibility) only once it was dragged onto more powerful processors as a back-compatibility feature.

Ideally, if the 386 hadn't had any back-compatibility requirements, its segmentation and paging mechanisms could have been designed hand-in-hand.

rdos · Post by **rdos** » Sat Jan 12, 2013 6:26 pm

Brendan wrote: So, someone writing malicious code runs your OS inside Bochs, figures out which addresses they need to mess with in about 5 minutes; and then your OS sets the world record for being pawned.

1. Because of bugs in Bochs, RDOS doesn't run properly inside Bochs.
2. RDOS protects itself from tampering in other ways (which I won't describe here on an open forum).

Besides, the issue of protection is not with malicious code, but stems from a need to compensate for inadequacies in C and the flat memory model.

Unlike what Griwes proposes, it is not just bad programmers who are the culprit; basically every programmer, no matter how experienced, makes errors, which in C can have fatal effects like destroying the heap or somebody else's data.

Brendan wrote: In C, each piece of data can be in its own separate segment (including putting each thing marked as "const" in its own read-only data segment), each function can be in its own separate ("execute only") segment, and each piece of memory returned by "malloc()" can be a separate ("read/write") segment; and you wouldn't even need to bend the language's specification slightly.

The issue is that it costs too much, and still doesn't solve the problems. For instance, there is no exact limit checking with paging (because pages are 4k). What would be needed is "safe" objects that the hardware ensures cannot be accessed out of bounds, and that are automatically checked to be valid and accessible.

Paging only works with C because it is transparent to C, and as such doesn't solve any issues at all. Paging can map part of the address space as invalid, but this kind of protection becomes weaker as programs / kernels become larger.
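To put numbers on the granularity point, here is a small Python sketch comparing a page-granular check with an exact limit check for one 100-byte object. The function names and addresses are made up for the illustration, and it assumes that only the pages the object touches are mapped.

```python
PAGE_SIZE = 4096  # 4k pages


def page_check_allows(alloc_base, alloc_size, addr):
    """Paging-style check: allowed if addr is on any page the object touches."""
    first_page = alloc_base // PAGE_SIZE
    last_page = (alloc_base + alloc_size - 1) // PAGE_SIZE
    return first_page <= addr // PAGE_SIZE <= last_page


def limit_check_allows(alloc_base, alloc_size, addr):
    """Segment-style check: allowed only inside the object's exact byte range."""
    return alloc_base <= addr < alloc_base + alloc_size


if __name__ == "__main__":
    base, size = 0x10000, 100
    one_past_end = base + size          # classic off-by-one overrun
    print(page_check_allows(base, size, one_past_end))   # same page, not caught
    print(limit_check_allows(base, size, one_past_end))  # caught
```

The overrun goes unnoticed by the page-level check until it crosses into an unmapped page, which for a 100-byte object can be almost 4k later.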

rdos · Post by **rdos** » Sat Jan 12, 2013 6:34 pm

linguofreak wrote: The thing is, protecting a program from itself is great for debugging (so that you catch an out-of-bounds array index when it occurs, rather than at the end of a seven-level chain of wild pointers), but just slows things down in production code. Therefore, it's a good thing to implement in software, but not so good to implement in hardware. Witness the iAPX 432 and how poorly it did.

For production code, self-protection doesn't really do anything: if there is a pointer bug in any code that runs while a program is executing (whether it be in the program itself or in a library), it will cause failures. Whether those failures cause an immediate segfault (with self-protection) or a delayed heisenbug (without it) is irrelevant to the end user. The program still doesn't do the job he set it to.

The line between production code and debug code is fine. Actually, in practise, there is no sharp line between a debug-release and a production release. A production release could still contain subtle bugs that might only occur rarely, and which aren't caught during the debug-phase. For these bugs you want to create meaningful traces that could lead to them being resolved. A fault in the heap manager, or a secondary fault in another thread is not helpful in that case, and often leads to the bug never being resolved. Or worse, the bug might lead to hangups in the system without any fault information at all.

linguofreak · Post by **linguofreak** » Sat Jan 12, 2013 6:52 pm

rdos wrote:
Brendan wrote: In C, each piece of data can be in its own separate segment (including putting each thing marked as "const" in its own read-only data segment), each function can be in its own separate ("execute only") segment, and each piece of memory returned by "malloc()" can be a separate ("read/write") segment; and you wouldn't even need to bend the language's specification slightly.
The issue is that it costs too much, and still doesn't solve the problems. For instance, there is no exact limit checking with paging (because pages are 4k).

Brendan wasn't talking about paging here. He was talking about using segmentation in C. With protected-mode segmentation you *do* have exact limit checking. You could have a standard-conformant C implementation where malloc() returned a segment selector. An array allocated with such a malloc() implementation could be limit-checked exactly. (OTOH, you'll run out of segments *really fast* that way on an x86).
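A toy model of that idea in Python, as a sketch only: `DescriptorTable`, `seg_malloc`, and `in_bounds` are hypothetical names, the table index stands in for a real selector, and a single table of 8192 entries is assumed (a 13-bit index), ignoring reserved entries and the granularity bit.

```python
class DescriptorTable:
    """Toy descriptor table: one entry (one limit) per allocation."""

    MAX_ENTRIES = 8192  # a 13-bit selector index can name this many entries

    def __init__(self):
        self.limits = []  # limits[i] is the last valid offset in segment i

    def seg_malloc(self, size):
        """Allocate a 'segment' of size bytes and return its index."""
        if size < 1:
            raise ValueError("size must be at least 1")
        if len(self.limits) >= self.MAX_ENTRIES:
            raise MemoryError("out of descriptors")
        self.limits.append(size - 1)
        return len(self.limits) - 1

    def in_bounds(self, selector, offset):
        """Exact limit check, as the hardware would do on each access."""
        return 0 <= offset <= self.limits[selector]


if __name__ == "__main__":
    table = DescriptorTable()
    arr = table.seg_malloc(10)
    print(table.in_bounds(arr, 9))    # last element: allowed
    print(table.in_bounds(arr, 10))   # one past the end: rejected
```

A program that makes a few thousand small allocations exhausts the table, which is the "run out of segments" problem.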

rdos wrote:Paging only works with C because it is transparent to C, and as such doesn't solve any issues at all.

It solves quite a few issues. What it doesn't solve is the non-issue of protecting a program from itself. A program that accesses an array out of bounds is in (almost certainly fatal) trouble whether the hardware (or other software) catches it or not.

rdos · Post by **rdos** » Sun Jan 13, 2013 3:16 am

linguofreak wrote:
rdos wrote:
Brendan wrote: In C, each piece of data can be in its own separate segment (including putting each thing marked as "const" in its own read-only data segment), each function can be in its own separate ("execute only") segment, and each piece of memory returned by "malloc()" can be a separate ("read/write") segment; and you wouldn't even need to bend the language's specification slightly.
The issue is that it costs too much, and still doesn't solve the problems. For instance, there is no exact limit checking with paging (because pages are 4k).
Brendan wasn't talking about paging here. He was talking about using segmentation in C. With protected-mode segmentation you *do* have exact limit checking. You could have a standard-conformant C implementation where malloc() returned a segment selector. An array allocated with such a malloc() implementation could be limit-checked exactly. (OTOH, you'll run out of segments *really fast* that way on an x86).

Certainly. I've had such an implementation running myself. It works until there is some type of object that needs a large number of instances. The major problem here is the 286. Since the 286 was a 16-bit architecture, it had 16-bit segment registers. When Intel extended the architecture to 32 bits, they forgot to extend the segment registers to 32 bits, and because of that segmentation leads to out-of-descriptor problems in some types of applications / use cases.
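For reference, a 16-bit x86 selector holds a 13-bit table index, one table-indicator bit (GDT or LDT), and a 2-bit requested privilege level, so only 8192 descriptors can be named per table. A minimal Python sketch of the decoding follows; the function name `decode_selector` is made up for the example.

```python
INDEX_BITS = 13
MAX_DESCRIPTORS_PER_TABLE = 1 << INDEX_BITS  # 8192


def decode_selector(selector):
    """Split a 16-bit selector into (index, table_indicator, rpl)."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    rpl = selector & 0b11                # bits 0-1: requested privilege level
    table_indicator = (selector >> 2) & 1  # bit 2: 0 = GDT, 1 = LDT
    index = selector >> 3                # bits 3-15: descriptor index
    return index, table_indicator, rpl


if __name__ == "__main__":
    print(decode_selector(0x001B))  # (3, 0, 3): GDT entry 3, RPL 3
    print(decode_selector(0xFFFF))  # (8191, 1, 3): last LDT entry
```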

Griwes · Post by **Griwes** » Sun Jan 13, 2013 7:08 am

rdos wrote:Unlike what Griwes proposes, it is not just bad programmers who are the culprit; basically every programmer, no matter how experienced, makes errors, which in C can have fatal effects like destroying the heap or somebody else's data.

If your system doesn't provide any protection from destroying "somebody else's" data by an application, then your system is broken. Since (in sane systems) virtually everything is running in its own address space, how in the world would you go about destroying "somebody else's" data?

rdos · Post by **rdos** » Sun Jan 13, 2013 9:22 am

Griwes wrote:
rdos wrote:Unlike what Griwes proposes, it is not just bad programmers who are the culprit; basically every programmer, no matter how experienced, makes errors, which in C can have fatal effects like destroying the heap or somebody else's data.
If your system doesn't provide any protection from destroying "somebody else's" data by an application, then your system is broken. Since (in sane systems) virtually everything is running in its own address space, how in the world would you go about destroying "somebody else's" data?

Nobody said anything about applications being able to destroy data belonging to other applications or the kernel. This is one of the issues that paging (partially) solves, especially in combination with some type of parameter validation (or, as in my case, by not letting the application have the full size of its "flat" selector).

Griwes · Post by **Griwes** » Sun Jan 13, 2013 11:20 am

Then who is "somebody else" you've been talking about?

bluemoon · Post by **bluemoon** » Sun Jan 13, 2013 12:36 pm

The bugs must belong to somebody else.

Perhaps he meant a mechanism to catch bugs written by someone else, such as out-of-bounds array accesses.

linguofreak · Post by **linguofreak** » Sun Jan 13, 2013 4:53 pm

Griwes wrote:Then who is "somebody else" you've been talking about?

From what I gather, he's talking about other code in the same address space, such as shared libraries and other parts of the running program than the one in question.

For example: Function foo() writes beyond the end of an array, clobbering memory that holds a data structure used by bar(). With respect to bar(), foo() is the "somebody else" rdos is talking about.
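A toy version of that scenario in Python, with a `bytearray` standing in for the shared flat address space. The layout, the names `make_heap`, `foo_fill`, and `bar_struct_intact`, and the marker bytes are all invented for the illustration.

```python
FOO_ARRAY_BASE, FOO_ARRAY_LEN = 0, 8     # foo()'s array: bytes 0..7
BAR_STRUCT_BASE, BAR_STRUCT_LEN = 8, 8   # bar()'s structure: bytes 8..15
BAR_CONTENTS = b"BARDATA!"               # 8 marker bytes for bar()'s data


def make_heap():
    """Build one flat 'address space' shared by foo() and bar()."""
    heap = bytearray(32)
    heap[BAR_STRUCT_BASE:BAR_STRUCT_BASE + BAR_STRUCT_LEN] = BAR_CONTENTS
    return heap


def foo_fill(heap, count):
    """Write count bytes into foo()'s array with no bounds check, as C would."""
    for i in range(count):
        heap[FOO_ARRAY_BASE + i] = 0xFF


def bar_struct_intact(heap):
    """Report whether bar()'s structure still holds its original contents."""
    end = BAR_STRUCT_BASE + BAR_STRUCT_LEN
    return bytes(heap[BAR_STRUCT_BASE:end]) == BAR_CONTENTS


if __name__ == "__main__":
    heap = make_heap()
    foo_fill(heap, FOO_ARRAY_LEN)        # stays inside foo()'s array
    print(bar_struct_intact(heap))
    foo_fill(heap, FOO_ARRAY_LEN + 2)    # two bytes too many
    print(bar_struct_intact(heap))
```

Nothing faults at the moment of the overrun; bar() only finds out later, when it reads its damaged structure.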

rdos wants to provide protection at the per-allocation level, rather than at the per-process level. x86 protected-mode segmentation can do this (though the number of selectors isn't really sufficient), but I really don't think being able to provide such protection is as beneficial as rdos thinks it is.

linguofreak · Post by **linguofreak** » Sun Jan 13, 2013 4:56 pm

rdos wrote:Certainly. I've had such an implementation running myself. It works until there is some type of object that needs a large number of instances. The major problem here is the 286. Since the 286 was a 16-bit architecture, it had 16-bit segment registers. When Intel extended the architecture to 32 bits, they forgot to extend the segment registers to 32 bits, and because of that segmentation leads to out-of-descriptor problems in some types of applications / use cases.

Frankly, I don't think the width of segment selectors was sufficient for per-allocation protection even on the 286.

OSDev.org
