Re: Multi-core CPUS
Posted: Fri Feb 27, 2009 2:58 pm
by bewing
Well, my argument is not precisely meant to be limited to kernels -- and when I talk about "crashes" I'm not exactly limiting that to complete system crashes. In this case, I'm generalizing to say that when you mix the address spaces of several SIP "user space" processes, you have created a situation where a hardware fault in one process can trash all the mixed processes at the same time ... and doesn't that basically defeat the purpose of having a managed-code system in the first place?
As far as hardware faults go: a high-radiation environment (space) will cause exponentially more hardware faults. As you said in your response to my original post (and I agree), as CPUs get smaller, faster, and more numerous, faults will increase exponentially. Not all hardware faults are fatal -- some are "benign" (the memory gets overwritten), some cause buggy app operation, some cause unexpected app termination, and a small percentage cause a system crash.
A Linux system will crash every 6 months -- how many of those crashes do you think might be hardware related ... and not software at all?
Re: Multi-core CPUS
Posted: Fri Feb 27, 2009 10:55 pm
by Brendan
Hi,
I've seen statistics (from someone running a large number of servers using ECC RAM) quoted as "1 error per GiB per month", combined with the suggestion that this error rate remains relatively constant regardless of RAM type and speed (smaller RAM chips have less target area and a higher chance of being affected if the target area is hit with an alpha particle). I assume that this applies to RAM without ECC too.
bewing wrote:A Linux system will crash every 6 months -- how many of those crashes do you think might be hardware related ... and not software at all?
My server has 8 GiB of RAM and no ECC and runs 24 hours per day; however, 65% of RAM is normally free/unused. If "1 error per GiB per month" is close to correct then on average I'd be expecting "in use" RAM to be affected every 10.87 days. In the last 6 months X has crashed once and Seamonkey has crashed about 4 times. I think Seamonkey crashes due to software bugs, as it seems to crash under similar situations each time. I don't know why X crashed.
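(For anyone checking the arithmetic, assuming a 30.44-day average month: 35% of 8 GiB is 2.8 GiB "in use", so at "1 error per GiB per month" that's 2.8 errors per month, or one error hitting "in use" RAM every 30.44 / 2.8 = 10.87 days.)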
It would be interesting to see the results of a study on the impact of RAM errors. For example, if a random bit somewhere in memory is changed, what chance is there that it'll cause a system crash, what chance is there it'll cause a process crash, what chance is there it'll affect nothing, etc?
The best way to do this sort of testing is with an emulator. For example, Bochs could be modified so that every N seconds it chooses a random bit in RAM and changes its state. Of course emulators could be used to inject a large range of faults into the guest (transient RAM errors, permanent RAM faults, disk read errors, network packet corruption, etc), and then used for very extensive fault tolerance studies...
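As a very rough sketch of the bit-flip idea (hypothetical code, not the actual Bochs source; the function and parameter names are made up for illustration):
Code:
#include <stdint.h>
#include <stdlib.h>

/* Flip one random bit somewhere in the guest's physical RAM to simulate a
 * transient memory error. A periodic timer in the emulator would call this
 * every N (emulated) seconds. */
static void inject_random_bit_flip(uint8_t *guest_ram, size_t ram_size)
{
    /* combine two rand() calls so guests larger than RAND_MAX bytes are
     * still covered (assumes a 64-bit size_t and a decent RAND_MAX) */
    size_t byte_index = (((size_t)rand() << 31) ^ (size_t)rand()) % ram_size;
    int bit_index = rand() % 8;

    guest_ram[byte_index] ^= (uint8_t)(1u << bit_index);
}
Logging the flipped address alongside the guest's behaviour would let you classify each injected fault as benign, app crash, or system crash.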
Cheers,
Brendan
Re: Multi-core CPUS
Posted: Sat Feb 28, 2009 3:17 am
by Colonel Kernel
bewing wrote:In this case, I'm generalizing to say that when you mix the address spaces of several SIP "user space" processes, you have created a situation where a hardware fault in one process can trash all the mixed processes at the same time ... and doesn't that basically defeat the purpose of having a managed-code system in the first place?
My counterpoint was that this is no different from many apps with plug-in architectures today. Actually, it's better, because it removes the possibility of software failures in a plug-in crashing the app. And, at least for now, software failures are much more common than hardware failures.
The bottom line is that software needs to be more fault-tolerant to deal with the coming hardware realities. Avoiding shared state by having a sealed-process architecture based on message passing makes this much easier, regardless of whether HIPs or SIPs are used. Killing and re-starting a process has far fewer side-effects in such a system.
Re: Multi-core CPUS
Posted: Sat Feb 28, 2009 9:24 am
by bewing
Brendan wrote:
Of course emulators could be used to inject a large range of faults into the guest (transient RAM errors, permanent RAM faults, disk read errors, network packet corruption, etc), and then used for very extensive fault tolerance studies...
Well, if some people want such features within a reasonable (i.e. geological) timeframe, they may have to help me code them in... The disassembler is rewritten now, and I'm starting work on devices. Then, alpha testing!
quoted as "1 error per GiB per month"
Which probably should be translated as ".05 (or .01) errors per billion transistors per month", and then applied to the CPUs also -- with the caveat that he's almost certainly in a low-radiation environment.
Re: Multi-core CPUS
Posted: Thu Apr 30, 2009 7:19 am
by Benk
Regarding the Singularity benchmarks: in the paragraph quoted below, they state it's probably close to worst case. I also note that future OSs may have to deal with much larger amounts of IPC due to SOA/services/cloud computing. And you could even argue that for cloud service cluster servers and browser sandbox clients, reliability is even less of an issue.
Try quoting all the relevant bits, rather than quoting out of context, before you shoot down someone's work:
"
The WebFiles benchmark clearly demonstrates the unsafe code tax, the overheads paid by every program running in a system built for unsafe code. With the TLB turned on and a single system-wide address space with 4KB pages, WebFiles experiences an immediate 6.3% slowdown. Moving the client SIP to a separate protection domain (still in ring 0) increases the slowdown to 18.9%. Moving the client SIP to ring 3 increases the slowdown to 33%. Finally, moving each of the three SIPs to a separate ring 3 protection domain increases the slowdown to 37.7%. By comparison, the runtime overhead for safe code is under 5% (measured by disabling generation of array bound and other checks in the compiler).

The unsafe code tax experienced by WebFiles may be worst case. Not all applications are as IPC intensive as WebFiles and few operating systems are fully isolated, hardware-protected microkernels. However, almost all systems in use today experience the overheads of running user processes in ring 3. In a traditional hardware-protected OS, every single process pays the unsafe code tax whether it contains safe or unsafe code. SIPs provide the option of forcing only unsafe programs to pay the unsafe code tax.
"
CPU-intensive benchmarks are meaningless unless they use OS calls (which the papers show are MUCH cheaper with Singularity). Memory benchmarks are probably useful. There are also charts comparing the cost of API calls, window creation, etc. against Windows, Linux, etc. There are many other benchmarks in the other 6 papers and 30 design documents released with the distribution, showing memory usage etc.
And honestly, Singularity has very few optimizations, and a real OS based on it would perform better; things like multi-core support came much later. Comparing a research OS like Minix 2 or Sing to a real, optimized OS is not apples to apples either.
I suspect there are a number of papers that were written internally but never released publicly (e.g. in 2006 they mentioned they were all going to run Sing on their home entertainment systems to see how reliable it is in practice, but there is no paper discussing real-life reliability). There are many ideas in Sing which will not see the light of day, and it's worth stating that the researchers don't really know either, else they would not have needed to build paging etc. for comparison. Some ideas worth looking at: why is the MM in the kernel (you could have it talking to the HAL and make it a SIP), their security policy system is very heavy, etc.
With regard to paging, I think:
1) 4K pages are too much overhead for most applications these days. Singularity in the test uses 4K pages.
2) Virtual memory is terrible in practice. It's an order of magnitude quicker to restart most applications than to get them going again after they've had memory stolen away from them, and besides, memory is cheap; by the time these OSs hit the street we'll all be looking at 8 GiB+ systems. Bad blocks or disk drivers can have a really nasty impact on a machine, and furthermore you need tight coupling between the disk driver and the MM.
3) There are many ways of dealing with memory fragmentation.
Re: Multi-core CPUS
Posted: Thu Apr 30, 2009 2:46 pm
by Owen
bewing wrote:
quoted as "1 error per GiB per month"
Which probably should be translated as ".05 (or .01) errors per billion transistors per month", and then applied to the CPUs also -- with the caveat that he's almost certainly in a low-radiation environment.
No, it couldn't. Not all transistors or ICs are created equal.
If it hits one of the MOSFETs involved in the CPU's power regulation, it will probably do bugger all (they're huge, saturated and well driven). If it hits some CPU cache or a register, it will do bugger all, since SRAM cells are pretty much invulnerable to radiation [in the short term]. The majority of the CPU doesn't have the one property that DRAM has: reads are destructive.
An alpha strike on DRAM causes the cell transistor to momentarily activate and allow the charge to flow out of it. This construct is not found inside processors for a variety of reasons - for a start, you need to regularly refresh DRAM. You also need a silicon process specialized for it. Processors tend to use 6-transistor SRAM (well, actually, the transistor count is often higher because the average processor needs multiple ports to its cache and registers).
Radiation does affect all integrated circuits, in the form of long-term damage rather than short-term malfunctions. Space ICs are very similar to normal ones - just made in special processes like Silicon on Sapphire, and avoiding problematic constructs like DRAM cells.
Re: Multi-core CPUS
Posted: Fri May 01, 2009 2:47 am
by Brendan
Hi,
Benk wrote:CPU-intensive benchmarks are meaningless unless they use OS calls (which the papers show are MUCH cheaper with Singularity).
Wrong. CPU-intensive code running under software isolation (with no API calls) suffers additional overhead (caused by things like array bounds checking, etc). The paragraph from the Singularity paper that you quoted mentions this ("the runtime overhead for safe code is under 5%").
For CPU-intensive workloads software isolation has more overhead, and for "IPC ping-pong" workloads hardware isolation has more overhead. I'm just saying it'd make sense to do an "IPC ping-pong" benchmark *and* a "CPU-intensive" benchmark, so that people don't get the wrong idea and think that software isolation always gives better performance.
Of course you'd think people reading research papers would be smart enough to read between the lines; but obviously this isn't the case...
Benk wrote:Memory benchmarks are probably useful. There are also charts comparing the cost of API calls, window creation, etc. against Windows, Linux, etc. There are many other benchmarks in the other 6 papers and 30 design documents released with the distribution, showing memory usage etc.
I think I've seen 2 papers and a few videos, but none of the design documents.
Benk wrote:With regard to paging, I think:
1) 4K pages are too much overhead for most applications these days. Singularity in the test uses 4K pages.
The problem with large pages is that there's wastage - if you're using 4 KiB pages and only need 1 KiB of RAM then you have to waste the other 3 KiB of RAM; and if you're using 1 GiB pages the OS will run out of RAM before it can display a login screen (and then it'll need to swap entire 1 GiB chunks to disk).
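To put rough (purely illustrative) numbers on it: on average you waste about half a page per separately allocated region, so 1000 regions cost roughly 1000 * 4 KiB / 2 = 2 MiB of slack with 4 KiB pages, but roughly 1000 * 2 MiB / 2 = 1 GiB of slack with 2 MiB pages.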
Benk wrote:2) Virtual memory is terrible in practice. It's an order of magnitude quicker to restart most applications than to get them going again after they've had memory stolen away from them, and besides, memory is cheap; by the time these OSs hit the street we'll all be looking at 8 GiB+ systems.
It's not an order of magnitude quicker to restart most applications - are you sure you know how virtual memory works?
By the time we're looking at 8 GiB+ systems we'll also be looking at 8 GiB+ applications (or at least 2 GiB applications running on 3 GiB OSs with 2 GiB of file cache and 1 GiB of video data). At the moment DDR2 is cheap, but DDR3 isn't, and newer computers need DDR3.
Benk wrote:Bad blocks or disk drivers can have a really nasty impact on a machine, and furthermore you need tight coupling between the disk driver and the MM.
If the OS runs out of RAM, then what do you suggest should happen:
- Do a kernel panic or "blue screen of death" and shut down the OS
- Terminate processes to free up RAM
- Let processes fail because they can't get the RAM they need, and let the process terminate itself
- Everyone doubles the amount of RAM installed every time they run out of RAM (to avoid random denial of service problems caused by the first 3 options), even if they only run out of RAM once in 5 years
- Allow data to be paged to/from disk
Benk wrote:3) There are many ways of dealing with memory fragmentation.
Sure, and they all suck (but paging sucks less).
Cheers,
Brendan
Re: Multi-core CPUS
Posted: Fri May 01, 2009 11:05 am
by Owen
Paging in some cases offers better performance than all apps sharing one heap because CAS and RAS strobes are reduced significantly. As an example, on DDR3, RAS strobes take 27 cycles. That's 27 cycles BEFORE the fact that it's DDR is taken into account. One eats 54 transfers, or enough time to transfer 864 bytes of data. The reason they're reduced? Because page fragmentation tends to coincide with where RAS strobes would be needed anyway. This is quite important when you consider that RAS and CAS strobes aren't getting any faster - they take pretty much the same time as they did 12 years ago - while bandwidth has increased 32x.
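(Working out those figures, assuming a 16-byte transfer width such as a dual-channel 64-bit bus: 27 cycles at two transfers per clock is 54 transfer slots, and 54 * 16 = 864 bytes.)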
Re: Multi-core CPUS
Posted: Fri May 01, 2009 10:30 pm
by Colonel Kernel
Owen wrote:Paging in some cases offers better performance than all apps sharing one heap because CAS and RAS strobes are reduced significantly. As an example, on DDR3, RAS strobes take 27 cycles. That's 27 cycles BEFORE the fact that it's DDR is taken into account. One eats 54 transfers, or enough time to transfer 864 bytes of data. The reason they're reduced? Because page fragmentation tends to coincide with where RAS strobes would be needed anyway. This is quite important when you consider that RAS and CAS strobes aren't getting any faster - they take pretty much the same time as they did 12 years ago - while bandwidth has increased 32x.
Interesting... Do you have a background in computer engineering? It's cool getting a deeper hardware perspective on things...
Although, it should be pointed out that in Singularity each process has its own heap, its own garbage collector, etc. The only heap shared by all processes is the exchange heap which is used for IPC.
Re: Multi-core CPUS
Posted: Sat May 02, 2009 4:37 am
by Owen
A formal one, no, but it's an area I am very interested in. One of my hobbies is working with FPGAs, and when you pair an FPGA with SDRAM, you get to know these kinds of things.
Re: Multi-core CPUS
Posted: Sat May 02, 2009 7:16 am
by bontanu
Of course Owen is completely wrong in his statements...
The question is: are you capable of understanding why?
Re: Multi-core CPUS
Posted: Sat May 02, 2009 10:53 am
by Colonel Kernel
Now that it's morning, something doesn't look quite right.
For one thing, internal fragmentation of pages is actually pretty rare compared to heap fragmentation, which would be a lot less regular (e.g. instead of every 4K, there could be "holes" scattered all over).
Also, doesn't this just amount to saying that accessing DRAM is a lot slower than not accessing DRAM? In other words, "fragmentation is good because we won't be wasting time reading from those addresses" (which is like saying doing my taxes is good because I wouldn't waste the time watching TV instead).
Re: Multi-core CPUS
Posted: Sat May 02, 2009 3:01 pm
by Owen
Paging means that an application's data is spread over fewer rows. The result is that, for applications which access very large data sets, the number of rows accessed in doing so is lower. Combine this with the fact that cache controllers generally don't load from pages which aren't mapped anywhere [for obvious reasons], and the processor spends less of its bandwidth hopping around through a fragmented heap. Because each row is all one application's data [excluding any fragmentation], cache hit rates go up too, particularly in conjunction with large caches.
The whole benefit comes from the fact that the processor spends less time waiting for the RAM state machine to open rows and more of its time bursting data in. The waiting only gets worse as bandwidth increases while memory cell speeds do not.
Re: Multi-core CPUS
Posted: Sun May 03, 2009 12:53 am
by Benk
We should start a new thread for paging and Singularity. It is interesting; I note the Singularity guys aren't sure either, they are hedging their bets and saying this is interesting.
Anyway, paging is OK (not good) provided your page is in the TLB. Bigger pages help here, but most people still use 4K pages. With increasing context switching in newer general-purpose OS and microkernel designs this will become more of an issue. Think cloud computing and hundreds of services. A lot of OSs flush the TLB on a context switch.
Since all references are managed (i.e. indirect), the GC (including the OS one) can just repack the memory. It happens all the time with the Java VM and the .NET runtime. The question is: can you repack it efficiently without using VM? Or can you have this VM without paging? People used to do it in the past.
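A minimal sketch of the kind of repacking I mean, assuming every object reference goes through a handle table (all names are made up; this is not the actual Singularity, JVM or .NET collector):
Code:
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void  *addr;  /* current location of the object, or NULL if the slot is dead */
    size_t size;  /* object size in bytes */
} handle_t;

static int by_addr(const void *a, const void *b)
{
    uintptr_t pa = (uintptr_t)((const handle_t *)a)->addr;
    uintptr_t pb = (uintptr_t)((const handle_t *)b)->addr;
    return (pa > pb) - (pa < pb);
}

/* Slide every live object down to the start of the heap. Because all code
 * reaches objects through the handle table, only the table needs patching.
 * (For brevity this sorts the table itself; a real collector would keep the
 * table fixed and compact from a separate list of live objects.) */
static void compact(char *heap, handle_t *handles, size_t count)
{
    char *dest = heap;

    /* process objects in address order so memmove never clobbers an
     * object that has not been moved yet */
    qsort(handles, count, sizeof *handles, by_addr);

    for (size_t i = 0; i < count; i++) {
        if (handles[i].addr == NULL)
            continue;                      /* dead slot */
        if (handles[i].addr != dest)
            memmove(dest, handles[i].addr, handles[i].size);
        handles[i].addr = dest;            /* patch the indirection */
        dest += handles[i].size;
    }
}
The point is just that with indirection the cost is one linear pass plus the copies, and nothing outside the table has to be fixed up.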
Regards,
Ben
Re: Multi-core CPUS
Posted: Sun May 03, 2009 12:55 am
by Benk
""Of course that Owen is completely wrong in his statements ...
The question is: are you capable to understand why?
Paging has little to do with memory if its not in the TLB your in trouble ( cant remember but something like 100 times as long) . If the page table is not in the CPU cache than your in even bigger trouble , large 4k page tables filling CPU cache is an issue these days.