Page 1 of 2

nedmalloc, again

Posted: Mon Nov 02, 2009 8:59 pm
by quanganht
Ok, so Ned advertised his malloc() is the fastest one on Earth. Then why there are so few people using it? Some sorts of impractical? Well, I will be very happy if someone can tell me how it works in general, or give a view from 10 miles above.

Re: nedmalloc, again

Posted: Mon Nov 02, 2009 9:06 pm
by AndrewAPrice
Good question.

Re: nedmalloc, again

Posted: Mon Nov 02, 2009 9:10 pm
by quanganht
MessiahAndrw wrote:Good question.
And where are the good answers ? :)

Re: nedmalloc, again

Posted: Tue Nov 03, 2009 2:03 am
by Combuster
from his own website:
ptmalloc3 currently outperforms nedmalloc for a low number of threads
Also given that he hardly mentions the current default - Doug Lea's Malloc

Bottom line: the "fastest" part is B.S.

Re: nedmalloc, again

Posted: Tue Nov 03, 2009 2:18 am
by AndrewAPrice
Also with open source libraries there is a big division between the GPL projects and the non GPL projects, the worlds rarely cross (unless they're smart and choose to duel license).

As for the title of 'fastest' at least the author showed the results of his tests (although they could be biased and testing scenarios specifically tailored for nedmalloc).

I don't like people giving titles like 'best' or 'fastest' without actually testing and showing the criteria. For example, you commonly here "America is the world's fattest nation", "Australia is the world's fattest nation", "Britain has taken over the title as the world's fattest", (depending on the source of the media you're exposed to) when all are actually incorrect: http://www.forbes.com/2007/02/07/worlds ... fat_2.html

Re: nedmalloc, again

Posted: Wed Nov 04, 2009 7:57 pm
by quanganht
what is the real idea behind Nedmalloc? does it use balanced tree?

And a little outof topic, how can a tree be implemented without malloc() ?

Re: nedmalloc, again

Posted: Wed Nov 04, 2009 9:37 pm
by AndrewAPrice
quanganht wrote:And a little outof topic, how can a tree be implemented without malloc() ?
You'd use a pool of memory instead. The simplest way is to to do this 'dynamically' is to allocate a page and treat it as one huge array of elements (bar some memory at the end which you can use as next page pointer).

Code: Select all

struct Tree
{
   Tree *left, *right;
   int value;
   void *objectPtr;
};
You'd keep a pointer to the first unallocated node, and each unallocated node's *right pointer points to the next unallocated node rather than a child. If there are no more unallocated nodes (pointers will be aligned to the node's size, so could set the smallest bit to 1 as a NULL value) allocate another page.

Since at the end of each page there's a pointer to the next page, a simple iteration can free it all.

Re: nedmalloc, again

Posted: Thu Nov 05, 2009 7:27 am
by JamesM
It beat dlmalloc by a fair bit in my test on a Core2duo:

Code: Select all

jm544@pc215s:~/malloctest/nedmalloc$ ./test

Testing standard allocator with 5 threads ...
This allocator achieves 3494427.523653ops/sec under 5 threads

Testing nedmalloc with 5 threads ...
This allocator achieves 10603035.314551ops/sec under 5 threads


nedmalloc allocator is 3.034270 times faster than standard
But, when I plugged my own allocator (SLAM) that I wrote for Pedigree (that's the "standard allocator" in this test):

Code: Select all

jm544@pc215s:~/malloctest/nedmalloc$ ./test

Testing standard allocator with 1 threads ...
This allocator achieves 10296512.547390ops/sec under 1 threads

Testing nedmalloc with 1 threads ...
This allocator achieves 6837715.182484ops/sec under 1 threads


nedmalloc allocator is 0.664081 times faster than standard

Code: Select all

jm544@pc215s:~/malloctest/nedmalloc$ ./test

Testing standard allocator with 5 threads ...
This allocator achieves 13368672.209795ops/sec under 5 threads

Testing nedmalloc with 5 threads ...
This allocator achieves 11023002.913025ops/sec under 5 threads


nedmalloc allocator is 0.824540 times faster than standard
Using nedmalloc's own test app in their SVN, compiling both allocators with -O3, no debugging.

Win. :-)

It should be noted here too that my allocator used glibc's standard malloc() for creating its slabs (it's based on the SLAB allocator), so it was slowed down some there too. I don't have time at the moment to make a memory pool with locking for testing purposes (in pedigree it just fell through to the PMM's allocatePage function).

Re: nedmalloc, again

Posted: Sat Nov 21, 2009 5:22 am
by JamesM
Bump: I think it would be interesting if people with their own custom allocators ran the benchmark and post what your score is. I'm interested :)

Re: nedmalloc, again

Posted: Tue Nov 24, 2009 6:56 am
by jal
JamesM wrote:Bump: I think it would be interesting if people with their own custom allocators ran the benchmark and post what your score is. I'm interested :)
What test program do we use? The nedalloc one?


JAL

Re: nedmalloc, again

Posted: Thu Nov 26, 2009 2:30 am
by quanganht
jal wrote:
JamesM wrote:Bump: I think it would be interesting if people with their own custom allocators ran the benchmark and post what your score is. I'm interested :)
What test program do we use? The nedalloc one?


JAL
Or if u think u can write a better test program ...

Re: nedmalloc, again

Posted: Thu Nov 26, 2009 6:45 am
by JamesM
jal wrote:
JamesM wrote:Bump: I think it would be interesting if people with their own custom allocators ran the benchmark and post what your score is. I'm interested :)
What test program do we use? The nedalloc one?


JAL
Sounds good: that's the one I used. I wanted to beat nedmalloc at its own game, so used its own benchmark.

Re: nedmalloc, again

Posted: Fri Nov 27, 2009 2:23 am
by jal
Hrmpf, the testprogram gets a sigabort if I enable the standard malloc test.


JAL

Re: nedmalloc, again

Posted: Fri Nov 27, 2009 6:21 am
by JamesM
:/ didn't for me, but then again I did test it 32-bit only. I haven't tried compiling it for 64-bit.

Re: nedmalloc, again

Posted: Fri Nov 27, 2009 6:50 am
by jal
JamesM wrote::/ didn't for me, but then again I did test it 32-bit only. I haven't tried compiling it for 64-bit.
I changed the "if (0)" to "if (1)" on line 313 of test.c, compiled it using a simple "gcc -pthread test.c" (gives a warning on line 4137 of malloc.c.h) and ran a.out. Compiled on Ubuntu 9.10. I'll try to see what's the problem...


JAL