embryo wrote:MessiahAndrw wrote:how do I tell the GC to not collect a white object that has been moved to a black object, mid-GC cycle?
I do not use terms like black or white. For me an object is reachable or not, and that's all possible states. According to the JVM specification every object must be placed on the bytecode stack before it can be used. Then I just intercept the placement action and mark the object as reachable. Does it became white or gray or black - is irrelevant for me, because I know - it shouldn't be collected.
'Grey' in the incremental GC papers I've read are aggregate types (arrays, objects) that are confirmed reachable, but haven't yet had their children scanned (they're added to a 'to scan' list.) 'Black' means the item is confirmed reachable, and we've already scanned all their children.
So anyway, my question with incremental garbage collection is, say we have this code:
Code: Select all
ObjectA = new Object();
ObjectB = new Object();
ObjectB.Child = new Object(); // add child object
// incremental GC gets called here
ObjectA.Child = ObjectB.Child; // move child object from B to A
ObjectB.Child = null;
// incremental GC gets called here
We have 3 objects. We assign a 'Child' object to ObjectB. Then the garbage collector runs and performs a 'mark'. However, it's an incremental garbage collector, so rather than marking everything at once recursively (because this could potentially be a very large program) we incrementally mark.
On the first GC call, ObjectA and ObjectB are considered 'unmarked'. We walk up the call stack, we notice that ObjectA and ObjectB are unmarked, so we mark them, and notice that they are aggregate types that could potentially have children, so we add them to a "to scan". We can ObjectA, mark each of it's children (which in this case are none), and remove it from ObjectA from the "to scan" list. Our incremental GC considered enough time has run, so it returns control to the program, with ObjectB still sitting on the "to scan" list.
Back in our code, we move our "Child" object from ObjectB (on the to scan list) to ObjectA (already scanned).
Our GC is called again. We walk up the call stack, notice that ObjectA has already been marked (this is an incremental GC, we don't restart from scratch every time the GC is called, otherwise it wouldn't be incremental), and ObjectB is already on the to scan list, so we walk the 'to scan' list. We mark each of ObjectB's children, however it no longer has any child objects because we assigned ObjectB.Child to null.
And so our 'Child' object is never marked, and is collected, because we assigned it to an object that was already 'marked' and not scanned again in an incremental GC until the GC collector has finished collecting and begins the next cycle, in which case it's too late because our 'Child' has already been assumed to be garbage and cleaned up!
The only solution I've found online is to have a write barrier (put it in my array/object/closure's assignment instruction) that says "if my parent is marked, also mark me, so the GC doesn't overlook me".
There is a case where garbage isn't collected, for example:
Code: Select all
ObjectA = new Object();
ObjectB = new Object();
ObjectB.Child = new Object(); // add child object
// incremental GC gets called here
ObjectA.Child = new Object(); // SPECIAL CASE
ObjectA.Child = ObjectB.Child; // move child object from B to A
ObjectB.Child = null;
// incremental GC gets called here
Between GC cycles, where ObjectA is already marked, we assign it a temporary object (our write barrier noticed it has already ObjectA is marked, so also marks the temporary object). Then, we overwrite the property with another object. We end up with a temporary object that is marked, but is also garbage.
We will have to wait until the next pass to collect that temporary garbage. I don't think this is actually a _bad thing_ because if the incremental GC continues to run at frequent intervals, it will eventually detect that.
embryo wrote:MessiahAndrw wrote:My OS's 'native' executable format is bytecode (
example assembly) - I developed a high level language that compiles down into this bytecode (
example). The OS is essentially a bare metal virtual machine.
Why have you invented your version of a bytecode? Is there some pros and cons?
For the same reason we're all writing operating systems -for fun and learning.
embryo wrote:And the language looks like JavaScript. Or is it different? What is the difference?
Yes - it does, and at the basic level it's a curly-brace language with dynamic objects, first class functions, and closures. My language lacks JS features like prototypes and exceptions, adds features like memory buffers as native types, and eventually I want to add coroutines, threads (JS is not multithreaded), metaproperties (LUA style metatables), static types (so you can declare 'unsigned a;' instead of 'var a;' to help with eventual JIT compilation), interfaces, lazy evaluation, and structures. What you see now is the core get-something-running language.