embryo2 wrote: It can make things like problem monitoring easier to implement. If we have high-level code (or bytecode), then it is possible to insert checks specific to particular hardware when the code is being compiled.
Brendan wrote: Sure, every time you read a variable you'd check if it's the same value that was stored in the variable last "somehow", every time you add 2 numbers you follow that by a check (e.g. "c = a + b; if( c-a != b ) .."), every time you multiply you follow it with a check (e.g. "c = a * b; if(c / a != b) ..."), etc. Of course it's going to be much much slower, you won't be able to use 2 or more CPUs to spread the load, and it's still going to fail when (e.g.) the code itself is corrupted or (e.g.) the CPU fails or (e.g.) someone unplugs the wrong power or network cable.
If you are talking about memory failures, then yes, it's hard to defend against them. But there is plenty of hardware besides memory. So, when someone unplugs the network cable we have (given a good design) something like layered exception handling: first the driver sets its output accordingly, then the TCP/IP implementation raises an exception to the upper level, the application that is currently working with the network. And even if there are no exception handlers defined by the programmer, the VM is still able to safely kill just one thread without the whole application being affected (whether through runtime checks or whatever else the VM allows us to implement).
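
As a rough illustration of the per-operation checks in Brendan's quote, here is a minimal Python sketch. The names `checked_add`, `checked_mul` and `HardwareFaultSuspected` are hypothetical, and the guard against `a == 0` is an addition that the quoted pseudocode does not have. It only shows the shape of code a compiler or VM might insert, not how any real VM does it.

```python
class HardwareFaultSuspected(Exception):
    """Raised when a redundant check disagrees with the computed result."""


def checked_add(a: int, b: int) -> int:
    c = a + b
    # Redundant check from the quote: undo the addition and compare.
    if c - a != b:
        raise HardwareFaultSuspected(f"{a} + {b} gave {c}")
    return c


def checked_mul(a: int, b: int) -> int:
    c = a * b
    # The quote's "c / a != b" check. The a != 0 guard is an assumption
    # added here, because the division is undefined when a is zero.
    if a != 0 and c // a != b:
        raise HardwareFaultSuspected(f"{a} * {b} gave {c}")
    return c
```

On healthy hardware the checks never fire, so the only visible effect is the extra work per operation, which is the slowdown the quote complains about.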

embryo2 wrote: It helps in practice. If a handler has a bug, then its execution is aborted with an exception and its thread just stops running (if the VM was designed by competent architects), but all other threads still work and the application loses just some small part of its functionality.
Brendan wrote: That's extremely naive at best. In practice those threads will crash at "unfortunate" times (e.g. in the middle of modifying data while holding several locks) and you'll be screwed.
Well, you can look at many web sites and find exactly this "extremely naive" situation. The server just works, the site just works, but one page is not displayed because of a bug.
And of course, the lock bookkeeping can be done by the VM, which can use thread IDs to determine which thread should be resumed after the problem thread has crashed. If there is a concurrent data modification problem, then yes, the problem can spread to two threads, but that is still not a full application crash. And most often concurrent data modification is not a problem, because usually each handler modifies only its own data.
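
A minimal Python sketch of that idea follows. It assumes the runtime's role is played by the `with` statement (which releases the lock when an exception unwinds) and by a hypothetical `isolated` wrapper that stops the exception at the thread boundary. All names are made up for illustration.

```python
import threading


def demo():
    lock = threading.Lock()
    log = []

    def buggy_handler():
        with lock:  # the lock is released even though the body raises
            raise RuntimeError("bug in handler")

    def healthy_handler():
        with lock:
            log.append("healthy handler finished")

    def isolated(handler):
        # Hypothetical stand-in for the VM: the failure ends this thread only.
        try:
            handler()
        except Exception as exc:
            log.append(f"handler crashed: {exc}")

    threads = [
        threading.Thread(target=isolated, args=(h,))
        for h in (buggy_handler, healthy_handler)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, lock.locked()
```

This shows only that the lock is freed and the other handler keeps running. It does nothing about data that the crashed handler left half-modified, which is the case where the problem can spread to a second thread.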

embryo2 wrote: but the VM's bugs affect millions of programs, so they can be detected very quickly. Then, despite the bug's complexity, the required improvement will be made in a short time.
Brendan wrote: More like the opposite - every time they add new features to the VM they introduce more bugs that affect millions of programs.
If a bug affects millions of programs and nobody cares, then it seems to me that there is no bug at all.

embryo2 wrote: And if you know about the problem, would you expect that a program with a race condition will get the same result all the time? I think not. So, first we should ensure there is no race condition, and only after that can we compare outputs of a program from different computers.
Brendan wrote: First we ensure there's no race conditions "somehow" (with magic or prayer?); then we compare outputs of a program from different computers to both detect and avoid problems caused by race conditions (which makes it easy to detect and correct those race conditions that we "ensured" couldn't happen "somehow")?
We ensure that there is no race condition by design, for example by writing code that works only with private data. And if we know that the code modifies data shared across many threads, then why would we need to compare the output of such threads? Should we wonder if there will be a difference, or should we cry because there is no difference?

Brendan wrote: it's better to give users the ability to choose whether to use it or not (instead of simply assuming all the software that will ever be run on the OS will always be "not important enough").
Yes, it's better, but the development time cries for mercy.

Brendan wrote: Also note that I'm planning a distributed system. The chance of one computer failing might be "acceptably low"; but when you have 100 computers working together the chance that one of those computers will fail is 100 times higher than "acceptably low".
The granularity of your "distribution" is much finer than it is for existing solutions. So the overhead of your solution will be much bigger than the overhead of the existing solutions (heavy networking because of string comparison calls, for example). Then why would anybody need such a replacement for existing solutions?
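
For reference, the "100 times higher" figure in the quote is the small-probability approximation of 1 - (1 - p)^n. The sketch below assumes that machines fail independently and with the same probability p, which the thread does not state. The function name is hypothetical.

```python
def any_failure_probability(p: float, n: int) -> float:
    """Chance that at least one of n machines fails, assuming independent
    failures that each happen with probability p."""
    return 1.0 - (1.0 - p) ** n
```

With p = 0.001 and n = 100 this gives about 0.095, slightly below the 100 * p = 0.1 estimate, so "100 times higher" is a close upper bound as long as p is small.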

embryo2 wrote: Just let the running function finish and prevent new calls from accessing the old function (by changing its address). It's simple.
Brendan wrote: So now you need some sort of synchronisation point at the start and end of every function, plus some way to determine which functions use which data structures, and then you're still completely screwed if the function has some sort of main loop and you never leave that function (until/unless you exit the process). It's "simple" (like, winning the national lottery is simple - you just buy a ticket)!
No, actually it is you who needs the synchronization and so on for every call. But I never proposed low-level function interaction using something like messaging. Instead, I prefer coarse-grained solutions, where the synchronization code takes just a tiny fraction of the function's execution time (as it does for web services and the like).
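
The "change the address" proposal quoted above can be sketched in Python as one level of indirection. `SwappableFunction`, `greet_v1` and `greet_v2` are hypothetical names, and this is only an illustration under the assumption that callers always go through the wrapper.

```python
import threading


class SwappableFunction:
    """Callable wrapper whose implementation can be replaced at run time."""

    def __init__(self, impl):
        self._impl = impl
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Take a snapshot of the current implementation. A call that is
        # already running keeps using the function it started with.
        with self._lock:
            impl = self._impl
        return impl(*args, **kwargs)

    def swap(self, new_impl):
        # New calls see new_impl; calls in flight are not interrupted.
        with self._lock:
            self._impl = new_impl


def greet_v1(name):
    return f"hello {name}"


def greet_v2(name):
    return f"hi {name}"
```

Usage: `greet = SwappableFunction(greet_v1)`, then `greet.swap(greet_v2)` redirects later calls. The sketch does not cover the objections in the quote: a function that never returns from its main loop is never replaced, and nothing here tracks which data structures the old and new versions share.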

Brendan wrote: If the box excludes everything unnecessary (e.g. one "box"/virtual address space for the application and a separate "box"/virtual address space for the library); and if there's a way to transfer information between "boxes" (e.g. messages) then it'd work because it's what I'm doing. The only difference is that you're using bloated/inefficient VMs to create the boxes
No, my box prevents access to the data that the developer chose to make inaccessible to a library. So my box ensures the security of the data in a different way than yours, but without the overhead of messaging on every call.

Brendan wrote: which creates a whole new "VM can touch everything" security problem that's just as bad as the "library can touch everything" problem that you were trying to solve.
Well, then what about the "OS can touch everything" security problem? Why do you think that the latter is any better than the former? And if you think they are comparable, then why do you stress the VM's problem only?