Type safe Memory manager

dc0d32 · Post by **dc0d32** » Fri May 25, 2007 9:45 pm

Hi

Most of the memory managers in today's interpreted/jitted/AOTed languages like Java, .NET etc. are coded in native languages which are inherently type unsafe.

I was wondering if there is any way by which i can write a memory manager and garbage collector in a _completely_ type safe way?

I have banged my head for hours on this, and could not find a solution that is completely type safe - and hence verifiable and hence dependable.

i intend to reduce the % of trusted components from my implementation - as the general trend goes - more verifiable -> more dependable and secure(well not exactly, but kind of).

Colonel Kernel · Post by **Colonel Kernel** » Fri May 25, 2007 10:39 pm

Type-safe garbage collection is an area of active research. It is not achievable without creating a new language with a new type system that is specialized for the task at hand. The kind of logic proofs and math involved in this kind of research is way over my head -- it's something I would like to learn if I ever get my Master's degree.

If you're interested, there are a few research papers out there on the topic. Here is one of the first links I found:

http://www.cs.princeton.edu/sip/projects/garbage/

BTW, I think this belongs in "OS Design & Theory".

dc0d32 · Post by **dc0d32** » Sat May 26, 2007 12:27 am

Is that even math in that paper

That should be enough to keep me engaged for days.

dc0d32 · Post by **dc0d32** » Sun May 27, 2007 4:39 pm

what i understand from all the references, is that doing a type safe GC in Java as of now seems impossible, since we will have to make changes to java bytecode.

is that correct, or am i missing something?

Thanks

Colonel Kernel · Post by **Colonel Kernel** » Sun May 27, 2007 5:18 pm

prashant wrote:what i understand from all the references, is that doing a type safe GC in Java as of now seems impossible, since we will have to make changes to java bytecode.

is that correct, or am i missing something?

Thanks

I'm not sure -- I haven't had the time to wade through all the greek letters in those papers.

I'm guessing that changing the bytecode isn't really going to help. Instead, a new higher-level language is needed with a new type system that can express the abstract operations performed by a GC. I think the idea is that a compiler for this new language would produce code, as well as metadata that can be used as a proof that the code is type-safe according to the rules of the new language. Formal proofs are required to show that the language itself is "safe" in this sense.

Crazed123 · Post by **Crazed123** » Sun May 27, 2007 11:27 pm

What about LLVM?

Colonel Kernel · Post by **Colonel Kernel** » Sun May 27, 2007 11:48 pm

Crazed123 wrote:What about LLVM?

What about it? It's an intermediate instruction format. As I said, code by itself (whether bytecode, native code, or otherwise) isn't sufficient to prove whether a GC implementation is type-safe.

Crazed123 · Post by **Crazed123** » Mon May 28, 2007 8:53 pm

Right, I just pointed it out because they seem to put some kind of type information on their intermediate code. Forgive my ignorance, I haven't learned formal proofs of programs yet.

doob · Post by **doob** » Tue May 29, 2007 4:43 pm

You could try D http://www.digitalmars.com/d/index.html
http://en.wikipedia.org/wiki/D_programming_language

Crazed123 · Post by **Crazed123** » Tue May 29, 2007 5:38 pm

As much as I like D, Phobos and the language itself don't seem stable enough for use on bare metal.

Also, I don't see how its garbage collector (which you must disable at the bare-metal level) helps create a type-safe memory manager.

doob · Post by **doob** » Wed May 30, 2007 3:20 am

I don't know much about creating a memory manager or using a language on bare metal but I read a document about type safe os and it mentions a couple of things that is type safe and that D has.

http://svn.dsource.org/projects/titan/t ... SafeOS.pdf

binutils · Post by **binutils** » Wed May 30, 2007 4:30 am

Colonel Kernel wrote:Type-safe garbage collection is an area of active research. It is not achievable without creating a new language with a new type system that is specialized for the task at hand.

http://cyclone.thelanguage.org/

Colonel Kernel · Post by **Colonel Kernel** » Wed May 30, 2007 8:53 am

doob wrote:I don't know much about creating a memory manager or using a language on bare metal but I read a document about type safe os and it mentions a couple of things that is type safe and that D has.

http://svn.dsource.org/projects/titan/t ... SafeOS.pdf

I couldn't find any mention of D in that paper, nor of using Clay to implement type-safe memory management code.

binutils wrote:http://cyclone.thelanguage.org/

Thanks for the link -- that looks interesting, especially the part about regions.

It sounds like they were able to implement a simple stop & copy GC using Cyclone, but that it required a certain amount of hacking.

From here:

However what I did glean from the discussion was that Cyclone's straightforward memory leak detection algorithm doesn't lead to a small, well-defined interface. Case in point: in order to implement a garbage collector they had to extend the runtime. It was a small function (only 5 lines of code) but nevertheless it was not possible at the user level. In effect, they had to introduce a new proof rule into the type system. That's disturbing to me -- it means the type system is incomplete. And these are smart people, so if it wasn't complete to start with, that probably means it'll never be complete.

What's even more disturbing is that at the end of the talk, Norman Ramsey asked how they would implement a generational garbage collector. And the answer was, "We haven't figured out how to do that yet... it may require dependent types." Eeek! Suddenly I realized at a deep level that advanced type systems are equivalent to mathematical proofs, and that if you start trying to prove too much about such an unpredictable beast as a program, it will resist you with a vengence.

doob · Post by **doob** » Wed May 30, 2007 10:39 am

Colonel Kernel wrote:I couldn't find any mention of D in that paper, nor of using Clay to implement type-safe memory management code.

I never said it mentioned D, but i mentions a couple of type safe features that Clay has that (if understand them correct) D also has. I also never said it implement type safe memory management code, I thought if you had a type safe language you could write type safe memory management code.

Colonel Kernel · Post by **Colonel Kernel** » Wed May 30, 2007 2:23 pm

doob wrote:I thought if you had a type safe language you could write type safe memory management code.

Yes, if the type system is expressive enough, but for languages like Java, C#, and D, it's not.

Think of it like this -- type safety is really just being used a proxy for memory safety. What we really want to avoid at the end of the day is leaked memory, reading from or writing to memory that doesn't belong to our program, etc. All that type safety means is that the language enforces a certain kind of consistency -- any operation you can express must follow certain rules. Type systems in conventional GC'd languages are geared towards guaranteeing basic memory safety -- that is, if you follow the rules, your program is type-safe, and because the type system is designed properly, it is also memory-safe.

The problem with GCs and memory managers is that they do things to untyped memory all the time. There are relationships and characteristics of those memory regions manipulated by a GC or MM that are not possible to formalize using a conventional type system (as the OP has discovered).

Try writing a type-safe GC or memory manager and you'll see what the problem is, even if it's difficult to put into words.