less painful memory management
Posted: Wed Oct 21, 2009 5:39 pm
This is a follow-up of sorts to the thread I made a couple of weeks ago about a low-level Lisp dialect.
I'm sure I'm not the only one who doesn't like manual memory management. Languages with garbage collection have been shown many times to improve productivity for that very reason. However, we in the osdev world can't easily build a garbage collector, because it requires language support, and settling on a common language that has GC would force that requirement on everyone. Otherwise, we would probably all be using GC. But maybe low-level memory management itself is not the problem. While trying to figure out a good way to do manual memory management in a Lisp-like language, I think I came up with a better and cleaner way of doing it in general.
The basic idea is this: memory management should not be thought of as a series of allocations and frees, but as the movement of a data structure between a temporary and a permanent position. If the structure is allocated on the stack, it is temporary and cannot leak; it is freed when the function returns. If the structure is allocated in the heap, it is permanent, and can be passed out of functions without being freed. GC is just one step of abstraction above this model: temporary vs. permanent is decided automatically.
The design is composed of two complementary functions: loc() and glo() (short for localize and globalize).
The glo() function takes a pointer and a size, and returns a pointer to an area of memory on the heap holding a copy of the data the given pointer referred to. If the given pointer is already in the heap, it is passed right back without copying.
The loc() function takes a pointer and a size, and returns a pointer to an area of memory on the stack holding a copy of the data the given pointer referred to. If the given pointer is in the heap, the heap copy is freed. If the given pointer is on the stack, the data is still copied. In C, loc() would probably have to be a macro that uses alloca() internally, since memory obtained from alloca() inside an ordinary function is released as soon as that function returns.
This in-heap/not-in-heap detection requires some support from the libc or whatever implements the heap, but the extension is very straightforward: only one address range needs to be checked, and double frees would not need to be detected. In the language I designed, pointers have implicit size attributes, which makes the syntax cleaner, but it could still be done in C quite well.
As you can see, there is a sort of conservation of information between these functions. The general pattern would be to allocate a structure as a local variable, initialize it, then set a pointer in the larger data structure to the glo() of it. Freeing doesn't have to be deferred until all manipulation is done, either: when a structure is detached from a larger data structure with loc(), it can still be read and written until the function ends, which is imo more intuitive and less likely to cause an accidental leak than saving the pointer to free at the end.
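To make the two functions and the attach/detach pattern concrete, here is a minimal sketch in C. Everything beyond the names glo() and loc() is an assumption made for illustration: the toy fixed-slot heap, the helpers in_heap(), heap_alloc(), heap_free() and heap_live(), the linked-list struct node with push() and pop(), and the three-argument statement form of LOC(). It also relies on <alloca.h>, which is not standard C (glibc and macOS provide it).

```c
#include <alloca.h>   /* non-standard; assumed available (glibc, macOS) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy heap: fixed-size slots inside one static arena. */
#define SLOT_SIZE  64
#define SLOT_COUNT 16

union slot {
    long double   align_ld;          /* these members only force alignment */
    long long     align_ll;
    void         *align_p;
    unsigned char bytes[SLOT_SIZE];
};

static union slot    heap_slots[SLOT_COUNT];
static unsigned char slot_used[SLOT_COUNT];

/* The single range check the heap has to expose. */
static int in_heap(const void *p)
{
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)heap_slots;
    return a >= lo && a < lo + sizeof heap_slots;
}

static void *heap_alloc(size_t size)
{
    if (size > SLOT_SIZE) return NULL;
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!slot_used[i]) { slot_used[i] = 1; return &heap_slots[i]; }
    }
    return NULL;                     /* toy heap is full */
}

static void heap_free(void *p)
{
    assert(in_heap(p));
    slot_used[(union slot *)p - heap_slots] = 0;
}

/* Number of slots currently in use (handy for spotting leaks). */
static size_t heap_live(void)
{
    size_t n = 0;
    for (size_t i = 0; i < SLOT_COUNT; i++) n += slot_used[i];
    return n;
}

/* glo: return a heap copy; a pointer already in the heap passes through. */
static void *glo(void *ptr, size_t size)
{
    if (in_heap(ptr)) return ptr;
    void *copy = heap_alloc(size);
    if (copy) memcpy(copy, ptr, size);
    return copy;                     /* NULL if the toy heap cannot hold it */
}

/* loc: copy into the caller's stack frame, then free the heap original. */
#define LOC(dst, src, size) do {                     \
        void  *loc_src_ = (src);                     \
        size_t loc_n_   = (size);                    \
        void  *loc_tmp_ = alloca(loc_n_);            \
        memcpy(loc_tmp_, loc_src_, loc_n_);          \
        if (in_heap(loc_src_)) heap_free(loc_src_);  \
        (dst) = loc_tmp_;                            \
    } while (0)

struct node { int value; struct node *next; };

/* Attach: build the node as a local, then globalize it into the list. */
static void push(struct node **head, int value)
{
    struct node tmp = { value, *head };
    struct node *n = glo(&tmp, sizeof tmp);
    if (n) *head = n;
}

/* Detach: localize the head node (the list must not be empty). */
static int pop(struct node **head)
{
    struct node *n;
    LOC(n, *head, sizeof *n);        /* heap slot is released right here */
    *head = n->next;                 /* the stack copy is still usable */
    return n->value;
}
```

LOC() is written as a statement macro rather than an expression because calling alloca() inside a function's argument list is not reliable on every compiler. The stack copy it produces stays valid until the enclosing function returns, which is what lets pop() keep reading the node after its heap slot has been released.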
Of course, this comes at the cost of a lot of memory copying, so it wouldn't be as fast as malloc() and free(), especially for large allocations - that's why it can't replace them entirely. However, I believe that this clean, intuitive, no-special-case design would be a reasonable advancement over the usual C WYSIWYG manual memory management. It also needs no additional library or language support beyond a simple addition to the libc.
Questions? Comments? Ideas for making this better?