Kernel level garbage-collection discussion
Re: Kernel level garbage-collection discussion
Just remember, that when GC is running all other threads are stopped.
If you put GC at the lowest level, this means that for running a GC loop you will have to stop all the threads on the machine in entirety.
If you put GC at the lowest level, this means that for running a GC loop you will have to stop all the threads on the machine in entirety.
Learn to read.
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Kernel level garbage-collection discussion
GC usually also implicitly mean that you have some functionality that move objects in memory in order to reduce fragmentation. Typically your Java runtime has something like this. This is however not mandatory for a GC system.tjmonk15 wrote:bluemoon and OSWhatever: Locks aren't really problem in terms of GC due to the fact the the GC should not need locks period. It simply "loops" over all "objects" and determines if there are any references to it. If the reference count changes mid check, it is likely decreasing by 1, in which case it will be collected next check (if it reached 0), or it is increasing by 1 (where the count was already >= 1) and shouldn't be collected anyways.
As long as at object creation the reference count is set to 1 before being added to the GC Manager, and following those assumptions above, you will never need a lock.
There are, of course, edge cases to consider, such as objects that shouldn't be GC'd, because they are found/modified using pointer arithmetic (page tables? ) and there should be a way in the memory manager to request non-GC'd memory for this purpose.
- Monk
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: Kernel level garbage-collection discussion
Only with non-concurrent collector designs. Concurrent collector designs have existed for years (and both Java and .net use them)dozniak wrote:Just remember, that when GC is running all other threads are stopped..
Re: Kernel level garbage-collection discussion
As far as I understand .net concurrent collector, it uses N (where N usually equals number of processor cores) garbage collector threads to compact things faster, while all other threads are frozen waiting for GC to complete. Not sure about Java ones.Owen wrote:Only with non-concurrent collector designs. Concurrent collector designs have existed for years (and both Java and .net use them)dozniak wrote:Just remember, that when GC is running all other threads are stopped..
Learn to read.
-
- Member
- Posts: 283
- Joined: Mon Jan 03, 2011 6:58 pm
Re: Kernel level garbage-collection discussion
For an OS level, instead of a VM level, GC solution, putting this in an idle thread and allowing it to be interupted/pre-empted would make it concurrent I guess? But really, it wouldn't matter. Especially, when you make the assumptions you can when you are at the OS level instead on at the VM(user-space) level.Owen wrote:Only with non-concurrent collector designs. Concurrent collector designs have existed for years (and both Java and .net use them)dozniak wrote:Just remember, that when GC is running all other threads are stopped..
As you point out, this is not a required part of a GC implementation, and I think more importantly, that is it part of a VM implementation (not GC at all.) And in this case you can not make the same assumptions you can about VM/fragmentation routines, that you can about GC routines. The important part, that I see anyways, is segregating your code properly, so that you can make as many assumptions as possible (separation of concerns/responsibilities) such that your code can be more performant.OSwhatever wrote:GC usually also implicitly mean that you have some functionality that move objects in memory in order to reduce fragmentation. Typically your Java run-time has something like this. This is however not mandatory for a GC system.tjmonk15 wrote:bluemoon and OSWhatever: Locks aren't really problem in terms of GC due to the fact the the GC should not need locks period. It simply "loops" over all "objects" and determines if there are any references to it. If the reference count changes mid check, it is likely decreasing by 1, in which case it will be collected next check (if it reached 0), or it is increasing by 1 (where the count was already >= 1) and shouldn't be collected anyways.
As long as at object creation the reference count is set to 1 before being added to the GC Manager, and following those assumptions above, you will never need a lock.
There are, of course, edge cases to consider, such as objects that shouldn't be GC'd, because they are found/modified using pointer arithmetic (page tables? ) and there should be a way in the memory manager to request non-GC'd memory for this purpose.
- Monk
- Monk
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Kernel level garbage-collection discussion
If there is no memory fragmentation involved, then I have some sort of GC system in my kernel. First of all, many objects are reference counted and when the reference reaches zero it is moved to list that a low prio thread is picking from and deletes the objects. The reference counting is not for the purpose of GC but that ensuring that there are no references to the object before deleting it. In an SMP system, reference counting is many times necessary as if you want to delete an object on one CPU you have no idea if you really can do it because some other CPU might be playing with the object and this is conveniently solved with reference counting.tjmonk15 wrote:As you point out, this is not a required part of a GC implementation, and I think more importantly, that is it part of a VM implementation (not GC at all.) And in this case you can not make the same assumptions you can about VM/fragmentation routines, that you can about GC routines. The important part, that I see anyways, is segregating your code properly, so that you can make as many assumptions as possible (separation of concerns/responsibilities) such that your code can be more performant.
- Monk
-
- Member
- Posts: 283
- Joined: Mon Jan 03, 2011 6:58 pm
Re: Kernel level garbage-collection discussion
I think you are confusing some ideas here. GC is all about reference counting, and automatically (without [delete *ptr] or similar {my c++ is a lil rusty with ptrs}) deleting objects. If you do not have GC you MUST [delete] objects your self. You can reference count yourself and then delete when it reaches 0, but the idea of GC is that some other process/thread/mechanism does this for you.OSwhatever wrote:If there is no memory fragmentation involved, then I have some sort of GC system in my kernel. First of all, many objects are reference counted and when the reference reaches zero it is moved to list that a low prio thread is picking from and deletes the objects. The reference counting is not for the purpose of GC but that ensuring that there are no references to the object before deleting it. In an SMP system, reference counting is many times necessary as if you want to delete an object on one CPU you have no idea if you really can do it because some other CPU might be playing with the object and this is conveniently solved with reference counting.tjmonk15 wrote:As you point out, this is not a required part of a GC implementation, and I think more importantly, that is it part of a VM implementation (not GC at all.) And in this case you can not make the same assumptions you can about VM/fragmentation routines, that you can about GC routines. The important part, that I see anyways, is segregating your code properly, so that you can make as many assumptions as possible (separation of concerns/responsibilities) such that your code can be more performant.
- Monk
By what we both said above, I would agree that you have GC in your kernal. But I fail to see what fragmentation has to do with this? TBH, I fail to see the benefit of "fixing" (memory) fragmentation of a process that is running. To me, relocating objects in memory would break pointer's abound, and may cause some "nifty"/fast pointer math to fail. Unles you do it in physical memory and update virtual memory accordingly. But since ALL memory access go through the page tables/cache I think you are more likely to shoot yourself in the foot than cause a speed-up, no?
I may be way off base with that previous statement, but as I think about it logically, my statement makes sense, if you think about how things work at a low level. Please, anyone here, correct me if I'm wrong.
- Monk
Re: Kernel level garbage-collection discussion
My kernel actually has something remotely similar to a garbage collector. It less of a collector, more like insurance against memory leaks in the for a task-specific table of allocated memory. When a call to malloc() or similar is made within the space of a Task, a reference to that memory is save in the table. When a call to free() is made, the reference is removed.
Once the task is finished, cleanup runs and everything left in the table is deallocated, followed by the table itself. It seems to work just fine, and in some sense, does qualify as garbage collection, in that it prevent garbage from hanging around in memory. It doesn't add all too much overhead, since every entry is 4 only bytes (the size of a pointer). I'm pretty sure though that this is probably something that isn't all too uncommon.
My 2 cents.
Once the task is finished, cleanup runs and everything left in the table is deallocated, followed by the table itself. It seems to work just fine, and in some sense, does qualify as garbage collection, in that it prevent garbage from hanging around in memory. It doesn't add all too much overhead, since every entry is 4 only bytes (the size of a pointer). I'm pretty sure though that this is probably something that isn't all too uncommon.
My 2 cents.
Re: Kernel level garbage-collection discussion
Some garbage collectors do use compaction of live objects. Remember not all languages allow pointers and pointer arithmetic. The benefit is that it is trivially easy to allocate memory from a compacted memory pool. ( pret = p; p += size; return pret; )tjmonk15 wrote:I think you are confusing some ideas here. GC is all about reference counting, and automatically (without [delete *ptr] or similar {my c++ is a lil rusty with ptrs}) deleting objects. If you do not have GC you MUST [delete] objects your self. You can reference count yourself and then delete when it reaches 0, but the idea of GC is that some other process/thread/mechanism does this for you.OSwhatever wrote:If there is no memory fragmentation involved, then I have some sort of GC system in my kernel. First of all, many objects are reference counted and when the reference reaches zero it is moved to list that a low prio thread is picking from and deletes the objects. The reference counting is not for the purpose of GC but that ensuring that there are no references to the object before deleting it. In an SMP system, reference counting is many times necessary as if you want to delete an object on one CPU you have no idea if you really can do it because some other CPU might be playing with the object and this is conveniently solved with reference counting.tjmonk15 wrote:As you point out, this is not a required part of a GC implementation, and I think more importantly, that is it part of a VM implementation (not GC at all.) And in this case you can not make the same assumptions you can about VM/fragmentation routines, that you can about GC routines. The important part, that I see anyways, is segregating your code properly, so that you can make as many assumptions as possible (separation of concerns/responsibilities) such that your code can be more performant.
- Monk
By what we both said above, I would agree that you have GC in your kernal. But I fail to see what fragmentation has to do with this? TBH, I fail to see the benefit of "fixing" (memory) fragmentation of a process that is running. To me, relocating objects in memory would break pointer's abound, and may cause some "nifty"/fast pointer math to fail. Unles you do it in physical memory and update virtual memory accordingly. But since ALL memory access go through the page tables/cache I think you are more likely to shoot yourself in the foot than cause a speed-up, no?
I may be way off base with that previous statement, but as I think about it logically, my statement makes sense, if you think about how things work at a low level. Please, anyone here, correct me if I'm wrong.
- Monk
If a trainstation is where trains stop, what is a workstation ?
-
- Member
- Posts: 283
- Joined: Mon Jan 03, 2011 6:58 pm
Re: Kernel level garbage-collection discussion
When a process is ended (either a process initiated quit/stop/end message, or a user initiated quit) all of its address space (in a paged protected mode setup) is invalidated and therefore free for use, without any extra steps.greyOne wrote:My kernel actually has something remotely similar to a garbage collector. It less of a collector, more like insurance against memory leaks in the for a task-specific table of allocated memory. When a call to malloc() or similar is made within the space of a Task, a reference to that memory is save in the table. When a call to free() is made, the reference is removed.
Once the task is finished, cleanup runs and everything left in the table is deallocated, followed by the table itself. It seems to work just fine, and in some sense, does qualify as garbage collection, in that it prevent garbage from hanging around in memory. It doesn't add all too much overhead, since every entry is 4 only bytes (the size of a pointer). I'm pretty sure though that this is probably something that isn't all too uncommon.
My 2 cents.
The main way to solve a memory leak as a user on windows is to kill the task and restart it, see above for why this works. This is NOT GC. GC runs while any other/all other processes are running, and collects memory that is not being used and frees it as it can.
- Monk
Last edited by FallenAvatar on Thu Mar 21, 2013 6:06 pm, edited 1 time in total.
Re: Kernel level garbage-collection discussion
When a process ends, sure.tjmonk15 wrote: When a process is ended (either a process initiated quit/stop/end message, and a user initiated quit) all of its address space (in a paged protected mode setup) is invalidated and therefore free for use, without any extra steps.
The main way to solve a memory leak as a user on windows is to kill the task and restart it, see above for why this works. This is NOT GC. GC runs while any other/all other processes are running, and collects memory that is not being used and frees it as it can.
- Monk
However threads share their parent process's address space, so my point stands. (At least, they do in my implementation.)
They do however have their own stack.
-
- Member
- Posts: 283
- Joined: Mon Jan 03, 2011 6:58 pm
Re: Kernel level garbage-collection discussion
I don't think a thread, and all of it's objects, should be collected when a thread ends. It should be perfectly valid for a thread to do some work on input (process variables) and provide any size of output (thread variables) and have that information stay resident/available to the parent Process.greyOne wrote:When a process ends, sure.tjmonk15 wrote: When a process is ended (either a process initiated quit/stop/end message, and a user initiated quit) all of its address space (in a paged protected mode setup) is invalidated and therefore free for use, without any extra steps.
The main way to solve a memory leak as a user on windows is to kill the task and restart it, see above for why this works. This is NOT GC. GC runs while any other/all other processes are running, and collects memory that is not being used and frees it as it can.
- Monk
However threads share their parent process's address space, so my point stands. (At least, they do in my implementation.)
They do however have their own stack.
What I suggested should only happen on Process end (the above quote and my previous point alludes to a combination of system and user input instead of either or, I have corrected it.) And on thread end would be a perfect importunity to invoke the GC for a few ms.