Request for testing a malloc..
Posted: Thu Mar 22, 2007 5:50 pm
Ok, so I wrote myself a new malloc, which I'll describe quickly, and then attach for testing. This is a very simple malloc right now, but my intention is to extend it to deal with the inefficiencies later. There are many, most of which have trivial solutions, but those would have complicated the basic algorithms a lot, so I opted for a minimal first version instead.
The basic idea is to have a chain of blocks, with "infoblocks" in between. Each info block knows how big the blocks before and after are. Heap begins with an infoblock <prev=0,next=foo> where foo is the size of the next allocated block. If the initial infoblock is at "first" then the next block is at:
second = address_of(start) + abs(start.next) + sizeof(infoblock)
There's an invariant which can catch trivial buffer overflows, currently checked by free:
second.prev = first.next.
If second is the last block, it's "next" will be zero.. if not, one can continue forward in the chain. The absolute value is because negative values are used for "used" blocks, while positive values are used for "free" blocks.
To allocate something, it basicly walks from the start until it finds a free block large enough. It then splits the block if the remaining part is large enough for the minimal allocation (which is 2-words right now). So if a free block is 8-words (=8*4 bytes on 32-bit) and alllocation was for 3-words (sizes aligned to words), then 3 words will be split, using 2 for the new infoblock in the middle, and leaving 3 words free block.
free() sanitychecks the "second.prev=first.next" invariant, merges blocks forward until it hits end of heap or an used block, and then merges blocks backwards until it hits start of heap or an used block.
realloc() is just a trivial wrapper for now, and does not attempt to extend a block in place even when it would be possible, and doesn't consider the old block free until after having copied the new block, which means it sometimes needs to grow heap even if it could have satisfied the request within current heap by being a bit more clever. Can be fixed later... calloc() as well is a trivial wrapper (which is fine)
The code is not thread safe or re-entrant, but compiling into a dynamic library and forcing it into programs with LD_PRELOAD should not crash any single-threaded (non-buggy) program.
Main inefficiency is that we always scan the whole heap O(n) in number of blocks before a sufficiently large free block, so performance is abysmal for programs that allocate huge amounts of small objects (certain text editors at least). This is the first thing I intend to fix (I have at least two strategies in mind).
Full list of library functions and operating system requirements as of now is:
There's wrappers in the code for morecore and panic for runnning in Linux which should probably work on most POSIX systems. All of that is in the beginning of the file.
I replaced Linux malloc (with LD_PRELOAD) for an xterm to test it, then running all kinds of basic stuff within the xterm, and it it seems to run just fine on my system (at least until you try to with a threaded process) but I'd like some of you to give it a shot.
You don't need to enable debugging to get the assertions (they are useful for users or the malloc as well) and enabling debugging will generate a HUGE flood of internal messages.
So... anyone with a nice operating system where plugging an alternative malloc either into kernel or into userspace isn't huge amount of work, please try running this, and tell if you see it crash. Extra credit for dumps of the last few screenfults of debug flood before a crash if you can reproduce it.
Like said, it's currently SLOW when one allocates lots of blocks, but...
License is MIT (as noted in the file).
The basic idea is to have a chain of blocks, with "infoblocks" in between. Each info block knows how big the blocks before and after are. Heap begins with an infoblock <prev=0,next=foo> where foo is the size of the next allocated block. If the initial infoblock is at "first" then the next block is at:
second = address_of(start) + abs(start.next) + sizeof(infoblock)
There's an invariant which can catch trivial buffer overflows, currently checked by free:
second.prev = first.next.
If second is the last block, it's "next" will be zero.. if not, one can continue forward in the chain. The absolute value is because negative values are used for "used" blocks, while positive values are used for "free" blocks.
To allocate something, it basicly walks from the start until it finds a free block large enough. It then splits the block if the remaining part is large enough for the minimal allocation (which is 2-words right now). So if a free block is 8-words (=8*4 bytes on 32-bit) and alllocation was for 3-words (sizes aligned to words), then 3 words will be split, using 2 for the new infoblock in the middle, and leaving 3 words free block.
free() sanitychecks the "second.prev=first.next" invariant, merges blocks forward until it hits end of heap or an used block, and then merges blocks backwards until it hits start of heap or an used block.
realloc() is just a trivial wrapper for now, and does not attempt to extend a block in place even when it would be possible, and doesn't consider the old block free until after having copied the new block, which means it sometimes needs to grow heap even if it could have satisfied the request within current heap by being a bit more clever. Can be fixed later... calloc() as well is a trivial wrapper (which is fine)
The code is not thread safe or re-entrant, but compiling into a dynamic library and forcing it into programs with LD_PRELOAD should not crash any single-threaded (non-buggy) program.
Main inefficiency is that we always scan the whole heap O(n) in number of blocks before a sufficiently large free block, so performance is abysmal for programs that allocate huge amounts of small objects (certain text editors at least). This is the first thing I intend to fix (I have at least two strategies in mind).
Full list of library functions and operating system requirements as of now is:
Code: Select all
- memset / memmove for calloc/realloc
adding trivial "static inline" version in the file if missing these
- printf() for debugging output
please note stuff will fail if printf calls malloc
only relevant when compiled with debugging (see below)
- panic(char*string) for killing process if assertions fail
"puts(string);exit(1);" is what it use as a wrapper by default
- morecore() function for well.. more core
for morecore, it currently panics "gracefully" if morecore returns
a block which is not directly after the end of heap (fix in future..)
so if you get message about continuous heap, that's the deal..
I replaced Linux malloc (with LD_PRELOAD) for an xterm to test it, then running all kinds of basic stuff within the xterm, and it it seems to run just fine on my system (at least until you try to with a threaded process) but I'd like some of you to give it a shot.
You don't need to enable debugging to get the assertions (they are useful for users or the malloc as well) and enabling debugging will generate a HUGE flood of internal messages.
So... anyone with a nice operating system where plugging an alternative malloc either into kernel or into userspace isn't huge amount of work, please try running this, and tell if you see it crash. Extra credit for dumps of the last few screenfults of debug flood before a crash if you can reproduce it.
Like said, it's currently SLOW when one allocates lots of blocks, but...
License is MIT (as noted in the file).