I finally got around to cleaning up, fixing, and generally polishing my good (not-so-)old malloc, so it's time to post a new version here, if only because I posted an early version earlier, and the current one is orders of magnitude better.
This is not the best malloc around. But it's not the worst either. It has some nice properties, being reasonably fast and reasonably space efficient, in almost all cases detecting any heap corruptions (buffer overflows, double frees) instantly, doesn't need monopoly over heap, and needs very little library support - for normal compile, all you need is morecore()/memset()/memcpy() and a panic(). Plus it's pretty simple.
When I say reasonably efficient, we are talking about full Voix kernel build needing about 10% more time, and a running Gnome-Vim taking more or less the same amount of memory as glibc stock malloc. For thread-safety, add implementations for malloc_lock() and malloc_unlock() which are no-ops as distributed.
It does much more sanity checking than one would need to detect the most common cases of heap corruptions. It doesn't do strict overflow checking though, so in cases where malloc returns a block slightly larger than was requested, it won't notice if you write into the excess part (since that won't corrupt heap). Stricter checking could be added, but I haven't bothered.
It's first-fit which should minimize heap fragmentation in theory, especially as it always splits/coalesces where possible. In practice best-fit might be preferable, but doing efficient best-fit is somewhat trickier and I wanted to avoid complexity and it performs quite fine as is. Maybe later. Realloc is still just a wrapper around malloc/free, without any attempt at trying to resize in-place, which I'll probably add next time I can be bothered to touch this code.
Why would anyone want to use this? I don't know. Maybe somebody needs a "simple" malloc for something, or just wants to see how such a thing could be written. It's reasonably simple to understand. Haven't tested it on 64-bits (don't have any available right now) but in theory it should work. If not, please tell and I'll fix it.
There's some printfs doing debug output but none of them gets compiled in unless __MALLOC_DEBUG__ is defined, which is only really useful when working with the malloc code. The panics on the other hand should be left there, since it's game-over if panic() is ever called (well maybe not, double-free's could be ignored safely, but I like to panic early).
Finally, if somebody wants to use it, but the license (which is MIT) is unacceptable, please tell, and we'll think of something.
Voix malloc.c rev 381.
Voix malloc.c rev 381.
- Attachments
-
- malloc.c
- (15.46 KiB) Downloaded 90 times
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
I like the idea of allocated block size being negative.
Code: Select all
// this looks weird but allocated blocks size is negative so..
if(size + ib->sNext > 0) {
void * newblock = malloc(size);
memmove(newblock, p, -ib->sNext);
free(p);
p = newblock;
}
Yeah well, it doesn't matter so much now, but it was the easiest thing to have when I originally implemented it without a freelist. With allocated blocks being negative, I could scan over all the blocks with a simple condition "skip blocks that are too small" and any allocated block would always be too small (because it's size was negative). Plus it saves you from masking the status from lowest bits (I guess that's where one would normally keep it).
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
You know how some system calls need to manipulate the kernel's heap. For storage of little knick knacks such as lists and what not. I was wondering what you thought about just giving each thread in the system one page of heap for calls that pass into the kernel. Then let the heap expand if it needed to.
Of course I came to this conclusion because of resource contention during a system call. I might have more than one thread needing to allocate some dynamic memory and would rather not make them deal with mutual exclusion and have to sleep?
Of course I came to this conclusion because of resource contention during a system call. I might have more than one thread needing to allocate some dynamic memory and would rather not make them deal with mutual exclusion and have to sleep?
Nah, you don't need to sleep. For an uniprocessor, you just disable interrupts in malloc_lock, and re-enable them in malloc_unlock if you disabled them for locking (need to keep the state to allow calls from both interrupt-disabled and interrupt-enabled contexts). For SMP, you also do a spinlock (it's a busyloop so it's not blocking). Malloc/free will never block for anything, assuming ofcourse that morecore need not block. It also makes locking easy by always locking once on entry, and unlocking once on exit, so you can keep the stored interrupts state in a static variable (like is done in the __KERNEL__ kludges part for Voix).Kevin McGuire wrote:You know how some system calls need to manipulate the kernel's heap. For storage of little knick knacks such as lists and what not. I was wondering what you thought about just giving each thread in the system one page of heap for calls that pass into the kernel. Then let the heap expand if it needed to.
Of course I came to this conclusion because of resource contention during a system call. I might have more than one thread needing to allocate some dynamic memory and would rather not make them deal with mutual exclusion and have to sleep?
I use it in my kernel all the time even from interrupt handlers, and my kernel would panic on the second you tried to block on anything in an interrupt handler.
Ofcourse there's theoretically potential for resource contention if you have a lot of processors, so you could end up with lots of spinlock waiting. But I think a better solution then would be to get rid of those cases where malloc or free might run for too long.
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.