Brendan wrote:
In the most general possible terms, I'd question whether very fine grained isolation (e.g. "per object") is worth the extra/unavoidable overhead (regardless of how it's done - with segmentation, managed code or something else); and whether coarse grained isolation (e.g. isolating processes from each other) is a much better compromise between granularity and overhead.

In situations where managed code (in the sense of compiler-verified memory access, not wrt GC et al.) is acceptable, it's actually a pretty easy decision. When your compiler can verify, with no runtime overhead, that all memory accesses are in bounds, why would you not use that method?
Functional Programming (was Pitfalls of lacking a MMU)
Re: Pitfalls of lacking a MMU
Hi,
Rusky wrote:
In situations where managed code (in the sense of compiler-verified memory access, not wrt GC et al.) is acceptable, it's actually a pretty easy decision. When your compiler can verify, with no runtime overhead, that all memory accesses are in bounds, why would you not use that method?

If a compiler can verify, with no runtime overhead, that all memory accesses are in bounds, then:
- The code being compiled is so simple that it doesn't matter what you do, or
- The language used is so restrictive that it's virtually useless for anything of reasonable complexity, or
- The performance of the resulting code is so bad that you wish you could've used a language that does run-time checks instead, or
- You've broken the world record for the most complex (and most error prone) compiler in history
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Pitfalls of lacking a MMU
False. Learn about dependent types.
For example, see this paper, this paper on eliminating array bounds checks with dependent types, and this paper on applying it to imperative languages, along with a lot more of that guy's work. Among other things, he applies dependent types, and with them compile-time array bounds checking, to ML, a widely used and decidedly not useless language. This is done conservatively, meaning it doesn't restrict what the language can do.
Without restricting things to super-simple examples, restricting the language, or over-complicating the compiler, they get performance increases of 12-79%, depending on the algorithm, by eliminating array bounds checks at compile time. Of course, there are still runtime checks of some variables, but only the ones that are necessary and only when they're not redundant. There's plenty of example code in those papers, and it doesn't require lots of arcane type incantations (in fact, some of the work in dependent types is on type inference).
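To make the papers' idea concrete, here is a rough sketch in plain C (not taken from the papers) of the pattern a dependently typed compiler automates: a range value is validated once at its construction, and every access that uses it afterwards relies on that established invariant instead of re-checking. The `range`/`make_range` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: the only runtime check lives in the "constructor";
 * in a dependently typed language the compiler tracks the invariant
 * (start <= end <= array length) in the index's type, so sum_range needs
 * no per-access bounds test. */
typedef struct { size_t start, end; } range;

/* Runtime check at the boundary: returns 0 on failure, 1 on success. */
int make_range(size_t start, size_t end, size_t len, range *out) {
    if (start > end || end > len) return 0;   /* the only check needed */
    out->start = start;
    out->end = end;
    return 1;
}

/* No bounds checks here: a valid range already guarantees safety. */
int sum_range(const int *a, range r) {
    int sum = 0;
    for (size_t i = r.start; i < r.end; i++)
        sum += a[i];                          /* provably in bounds */
    return sum;
}
```

The point is that redundant checks are eliminated: the check runs once, not on every access.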
Re: Pitfalls of lacking a MMU
Hi,
Rusky wrote:
Of course, there are still runtime checks of some variables, but only the ones that are necessary and only when they're not redundant.

Might be a good time to change your forum password. The "Rusky impersonator" who posted the previous message was talking about "When your compiler can verify, with no runtime overhead, that all memory accesses are in bounds", not "When your compiler can verify, with some runtime overhead, that most memory accesses are in bounds".
Cheers,
Brendan
Re: Pitfalls of lacking a MMU
Might be good to learn some computing theory.
It's not overhead if it's actually impossible to get by, securely, without it. The checks that are necessary are for things outside of what the compiler can possibly know in this universe, such as reading random bytes from the network. The compiler verifies, without requiring unnecessary or redundant runtime checks, that all memory access is in bounds, not only some.
Yes, that's all, 100%, every memory access. All of them. They are all verified to be in range at compile time. This is partly accomplished by disallowing them in certain branches where addresses are not guaranteed to be in bounds. Yes, that means there will be runtime checks, but there will not be extra, redundant, overhead-causing ones.
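The branch discipline described above can be shown in a few lines of C. A flow-sensitive checker would accept this function because the array access appears only inside the branch where the index is known to be in bounds; an unguarded `table[i]` would be rejected at compile time. (This is an illustrative sketch, not code from the thread.)

```c
#include <assert.h>

enum { TABLE_SIZE = 4 };
const int table[TABLE_SIZE] = {10, 20, 30, 40};

/* i might be anything, e.g. a byte read from the network. The access is
 * only permitted in the branch where the guard proves it is in bounds. */
int lookup(int i) {
    if (i >= 0 && i < TABLE_SIZE)
        return table[i];   /* in-bounds branch: access allowed */
    return -1;             /* out-of-bounds branch: no access occurs */
}
```

The runtime check exists, but it is the one that is genuinely necessary; nothing redundant is added.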
Re: Pitfalls of lacking a MMU
Hi,
Rusky wrote:
Might be good to learn some computing theory.
It's not overhead if it's actually impossible to get by, securely, without it. The checks that are necessary are for things outside of what the compiler can possibly know in this universe, such as reading random bytes from the network. The compiler verifies, without requiring unnecessary or redundant runtime checks, that all memory access is in bounds, not only some.

Think of it in terms of certainty. If the code says "let i = 3" then the compiler can be reasonably certain that "i == 3". If the documentation for a kernel API function says "This function will only ever return values from 0 to 3", then the programmer can be reasonably certain that the kernel API function will only ever return values from 0 to 3, but the compiler can't/won't know that. For IO, depending on where the data comes from (e.g. reading from an "untrusted" file vs. IPC from another instance of the same executable), the programmer may be certain about the values the data contains, but the compiler can't/won't know. Then there are things that in theory the compiler could know, but in practice doesn't, because they're too complicated to figure out (e.g. no sane programmer will ever expect any compiler to generate 100% optimal code in every possible case, including removing unnecessary checks - 100% optimal code is like absolute zero, in that it exists but isn't achievable in practice).
Rusky wrote:
Yes, that's all, 100%, every memory access. All of them. They are all verified to be in range at compile time. This is partly accomplished by disallowing them in certain branches where addresses are not guaranteed to be in bounds. Yes, that means there will be runtime checks, but there will not be extra, redundant, overhead-causing ones.

I don't think you understand the difference between 100% and 99.9999995%.
Cheers,
Brendan
Re: Pitfalls of lacking a MMU
Brendan wrote:
Think of it in terms of certainty. If the code says "let i = 3" then the compiler can be reasonably certain that "i == 3". If the documentation for a kernel API function says "This function will only ever return values from 0 to 3", then the programmer can be reasonably certain that the kernel API function will only ever return the values from 0 to 3, but the compiler can't/won't know that. For IO, depending on where the data comes from (e.g. reading from an "untrusted" file vs. IPC from another instance of the same executable) the programmer may be certain about the values the data contains, but the compiler can't/won't know.

This is why I gave you those papers. If, instead of merely documenting the return value's range in a comment, you put that restriction in the function's type, the compiler can verify it and then rely on it for further analysis. This is more difficult when e.g. crossing IPC boundaries, but not impossible, depending on the language/compiler design.
Brendan wrote:
Then there's things that in theory the compiler could know, but in practice the compiler doesn't know because it's too complicated to figure out (e.g. no sane programmer will ever expect any compiler to generate 100% optimal code in every possible case, including removing unnecessary checks - 100% optimal code is like absolute zero in that it exists but isn't achievable in practice).

What are some examples of this, in the case of bounds checking? I've given some pretty strong support for dependent types' abilities; you've just asserted that some things are "too complicated" for the compiler.
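Rusky's "range in the type" point can be approximated even in C, where the closest stand-in for a ranged return type is an enum. The function and its name below are hypothetical, invented purely to illustrate the propagation: if the type guarantees a result in 0..3, the caller may index a 4-element array with it and no check is ever needed. In C the guarantee is only a convention; with dependent types the compiler would verify both sides of the contract.

```c
#include <assert.h>

/* Hypothetical API whose type (approximated by an enum) promises a
 * result in 0..3 - the analogue of the "returns 0 to 3" kernel function
 * discussed above. */
typedef enum { Q0 = 0, Q1 = 1, Q2 = 2, Q3 = 3 } quadrant;

quadrant get_quadrant(int x, int y) {
    /* Bit 0: left half-plane; bit 1: lower half-plane. Result is 0..3. */
    return (quadrant)((x >= 0 ? 0 : 1) | (y >= 0 ? 0 : 2));
}

const char *names[4] = { "NE", "NW", "SE", "SW" };

const char *quadrant_name(int x, int y) {
    /* With a ranged return type, this index is provably in [0,4):
     * no runtime bounds check required. */
    return names[get_quadrant(x, y)];
}
```

Note that C itself does not enforce the enum's range, which is exactly Brendan's objection; the sketch only shows what information a stronger type system would carry.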
Re: Pitfalls of lacking a MMU
Hi,
Assume that the caller is in a separate object file.
Rusky wrote:
What are some examples of this, in the case of bounds checking? I've given some pretty strong support for dependent types' abilities, you've just asserted that some things are "too complicated" for the compiler.

Here's one:
Code:
int myArray[1234];

int foo(int index) {
    return myArray[index];
}
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Pitfalls of lacking a MMU
When compiling this code, the compiler reports an error that index can be out of range. To fix it, you either specify in the type of foo that index must be in [0,1234), or you add a check for whether it's in range. No knowledge of other files is necessary.
In the first case, when compiling the caller, you apply the same logic to whatever gets passed to foo. If it comes in from the network, for example, you have to add a runtime check at some point. If it comes from some other function (like that hypothetical API function that returns [0,3]), the compiler can use that information as well.
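Of the two fixes listed above, C can only express the second one directly. Here is a minimal sketch of that option: the check moves inside `foo` and out-of-range callers get a defined result. The first option (declaring in foo's type that index must be in [0,1234)) cannot be written in C at all; in a dependently typed language that declaration would let the compiler discharge this check at compile time wherever the argument is already known to be in range. (`foo_checked` and its -1 error convention are my own naming, not from the thread.)

```c
#include <assert.h>

int myArray[1234];

/* Option 2 from the discussion: a runtime range check inside the
 * function, giving a defined result for bad indices. */
int foo_checked(int index) {
    if (index < 0 || index >= 1234)
        return -1;              /* defined behaviour for bad callers */
    return myArray[index];      /* access only happens when in bounds */
}
```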
I'm beginning to think you didn't so much as glance at any of those papers or even google dependent types...
Code:
int myArray[1234];

int foo(int index) {
    return myArray[index];
}
Re: Pitfalls of lacking a MMU
Hi,
Rusky wrote:
When compiling this code, the compiler reports an error that index can be out of range. To fix it, you either specify in the type of foo that index must be in [0,1234), or you add a check for whether it's in range. No knowledge of other files necessary.
Code:
int myArray[1234];

int foo(int index) {
    return myArray[index];
}

So to avoid having any run-time checks, you force the caller to check (possibly at run-time) that the value it uses for "index" is within range before calling, or you check at run-time inside "foo()" that "index" is within range.

Here's a slight variation:

Code:
int myArray[10];

int foo(int value) {
    return myArray[value & 15];
}

Are you going to write a compiler that has a special integer type for numbers "from 0 to 9, or from 16 to 25, or from 32 to 41, or ..."?

Do I need to get more complex (maybe "return myArray[ sqrt(value) & 251];")?
Rusky wrote:
I'm beginning to think you didn't so much as glance at any of those papers or even google dependent types...

I did glance at the second one briefly. To be perfectly honest, I saw "ML" and put it in the "wanky academic crap" pile.
Cheers,
Brendan
Re: Pitfalls of lacking a MMU
Brendan wrote:
Are you going to write a compiler that has a special integer type for numbers "from 0 to 9, or from 16 to 25, or from 32 to 41, or ..."?
Do I need to get more complex (maybe "return myArray[ sqrt(value) & 251];")?

How about a compiler with general types for integers that satisfy certain conditions? sqrt is trivial to deal with in that way; bitwise operators are a little trickier but still possible, AFAIK. (If not, how useful is indexing an array with the result of a bitwise AND anyway?)
Think of those functions and operators as having more specific types that specify their domain and range more precisely. The compiler will end up having to do some computation, but that's addressed in the literature...
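The range reasoning for bitwise operators is straightforward to state in concrete terms (this sketch is mine, not from the papers): for any value `v`, `v & 15` is provably in [0,16), so indexing a 16-element table with it needs no runtime check at all, while masking into Brendan's 10-element array is exactly the case such a compiler would reject at compile time.

```c
#include <assert.h>

const int table16[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

/* The mask itself proves the index is in [0,16), so a range-tracking
 * compiler accepts this access with zero runtime checks. The same
 * expression against a 10-element array would be a compile-time error. */
int lookup16(unsigned v) {
    return table16[v & 15];
}
```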
In any case, my original point still stands that a well-designed managed language can gain a substantial performance increase through dependent types, and that hardware memory protection is unnecessary and harmful in that case.
Brendan wrote:
I did glance at the second one briefly. To be perfectly honest, I saw "ML" and put it in the "wanky academic crap" pile.

If you could get over your pretensions, you would probably sound a bit less silly.
Re: Pitfalls of lacking a MMU
Rusky wrote:
How about a compiler with general types for integers that satisfy certain conditions? sqrt is trivial to deal with in that way; bitwise operators are a little trickier but still possible AFAIK. (If not, how useful is indexing an array with the result of a bitwise and anyway?)

Hash tables (if modulo is overkill)?
Rusky wrote:
If you could get over your pretensions, you would probably sound a bit less silly.

"Rain increases the chance of car accidents. To improve safety, I invented a "functional car" - as soon as it detects the smallest drop of water it jams on the brakes, turns off the engine and refuses to move. People laugh at my "functional car" and call it funny names (like "non-functioning car"). I've been pushing the idea for over 50 years and still can't understand why nobody wants it."
"State increases the chance of programming mistakes. To improve "correctness", I invented "functional programming" - as soon as it detects the smallest amount of state, it spits out an error message and refuses to compile. People laugh at my "functional programming" and call it funny names (like "non-functioning programming"). I've been pushing the idea for over 50 years and still can't understand why nobody wants it."
Cheers,
Brendan
Re: Pitfalls of lacking a MMU
Brendan wrote:
Hash tables (if modulo is overkill)?

Modulo is significantly closer to what you mean (not to mention significantly simpler to handle with dependent types), and any optimizing compiler from the past few decades ought to be able to perform strength reduction on it.
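The strength reduction being referred to is easy to demonstrate: for unsigned values and a power-of-two table size, `hash % 16` and `hash & 15` are identical, so the hash-table index can be written as a modulo (which a range analysis reasons about easily) while the compiler emits the cheap AND. (A small illustration of mine, not code from the thread.)

```c
#include <assert.h>
#include <stdint.h>

/* What you write: a modulo, whose result is trivially in [0,16). */
unsigned bucket_mod(uint32_t hash) { return hash % 16u; }

/* What an optimizing compiler emits after strength reduction:
 * for unsigned operands and a power-of-two divisor, the two agree. */
unsigned bucket_and(uint32_t hash) { return hash & 15u; }
```

Note this equivalence only holds for unsigned operands; for negative signed values, `%` and `&` differ in C.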
Brendan wrote:
"Rain increases the chance of car accidents. To improve safety, I invented a "functional car" - as soon as it detects the smallest drop of water it jams on the brakes, turns off the engine and refuses to move. People laugh at my "functional car" and call it funny names (like "non-functioning car"). I've been pushing the idea for over 50 years and still can't understand why nobody wants it."
"State increases the chance of programming mistakes. To improve "correctness", I invented "functional programming" - as soon as it detects the smallest amount of state, it spits out an error message and refuses to compile. People laugh at my "functional programming" and call it funny names (like "non-functioning programming"). I've been pushing the idea for over 50 years and still can't understand why nobody wants it."

... and there you go again. ML does, in fact, allow mutable state. In addition, one of the papers I linked to applies dependent types to a regular old imperative language. And further, if that's your attitude toward functional programming, I'm sorry.
Re: Pitfalls of lacking a MMU
Hi,
Rusky wrote:
Modulo is significantly closer to what you mean (not to mention significantly simpler to handle with dependent types), and any optimizing compiler from the past few decades ought to be able to perform strength reduction on it.

You're avoiding the issue. How about, to avoid run-time checks, internally the compiler converts "int foo(int value)" to "int_with_standard_range foo(int_that_accepts_values_1to3_45_23_1983_6to15_4321to4444_elephant_and_69)", in a similar way to the name mangling C++ does to ensure type correctness during linking?
Rusky wrote:
... and there you go again. ML does, in fact, allow mutable state. In addition, one of the papers I linked to applies dependent types to a regular old imperative language. And further, if that's your attitude toward functional programming, I'm sorry.

You're right - to avoid confusion I should've said "pure functional programming" instead.
In the same way, I should've said "pure functional car", so people don't get confused and think I'm talking about cars that were wrapped in plastic to work-around the idiocy of the primary design goal (preventing people from driving in the rain).
The third paper applies dependent types to some contrived language they had to invent specifically for the purpose - it's hardly "regular" or "old".
Cheers,
Brendan
- Combuster
Re: Pitfalls of lacking a MMU
You both seem to have a problem:
Haskell for example has a special "monad" object called IO that indicates system state.
As for array indices, it is possible as per the proof I posted earlier that you can generate the least amount of checks needed for any terminating algorithm. It is however completely infeasible to do so as it has not yet turned out to be achievable in polynomial time. Therefore compilers use proving techniques (with heuristics to find correctness proofs) to eliminate many access checks. If you have loops over arrays you can for instance check that the index is bounded by zero and the array size, and that the array size does not change during the course of the loop. This becomes more and more difficult as the amount of code in the loop changes and becomes impossible if the effects of called code on any argument becomes unavailable. Whole-program optimizers can pull the boundary of knowledge over the file border and make completions of correctness proofs more feasible.
But in general, compiler-generated access checking is not efficient at all for any sufficiently complex code.
Brendan wrote:
I should've said "pure functional programming" instead.

Rusky wrote:
ML does, in fact, allow mutable state.

Every single functional language allows state, which works as follows:
Code:
/* "state" stands for the whole mutable world, threaded through explicitly: */
typedef struct { int world; } state;

typedef struct
{
    state a;        /* the new state, returned alongside the results */
    int return_1;
    int return_2;
} reply_data;

reply_data pure_functional_function(const state s, const int arg_1, const int arg_2);
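The loop case Combuster describes above can be made concrete (my sketch, not his code): the index is bounded below by zero and above by the array length, and the length does not change inside the loop, so a prover can discharge every access with one proof for the whole loop instead of a check per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Loop invariant: 0 <= i < len, and len is not modified in the body.
 * A compiler that proves this once eliminates the per-iteration bounds
 * check on a[i] entirely. */
int sum(const int *a, size_t len) {
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += a[i];      /* covered by the single loop-wide proof */
    return total;
}
```

As the quoted post notes, this gets harder as the loop body grows and becomes impossible once the effects of called code on the array are unknown, which is where whole-program analysis helps.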