I am currently evaluating a few scalable memory allocators, namely nedmalloc and ptmalloc (both built on top of dlmalloc), as a replacement for the default malloc / new because of significant contention seen in a multithreaded environment. Their published performance seems good, but I would like to hear the experiences of people who have actually used them.

  • Were your performance goals satisfied?
  • Did you experience any unexpected or hard-to-solve issues (like heap corruption)?
  • If you have tried both ptmalloc and nedmalloc, which of the two would you recommend? Why (ease of use, performance)?
  • Or perhaps you would recommend another scalable allocator (free with a permissive license preferred)?
+3  A: 

In the past I have needed a very fast way to allocate memory. I found that there wasn't an allocator that was up to the job.

After a couple of days of searching I came upon boost::pool, which gave our application a performance increase of 300x.

We effectively just call malloc/free on the objects we want to create. There is a little setup overhead, since a large amount of memory has to be malloc'ed to begin with, but once that is done it is very fast.
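To illustrate the idea (one large upfront malloc, then very cheap per-object allocation), here is a minimal sketch of a fixed-size block pool. It is not the boost::pool implementation or its API; `FixedPool` and all of its members are hypothetical names, and the sketch is single-threaded and never grows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical fixed-size pool: an illustration of pooling, not boost::pool.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)), block_count_(block_count) {
        // One large upfront allocation: this is the setup cost.
        storage_ = static_cast<unsigned char*>(
            std::malloc(block_size_ * block_count_));
        if (!storage_ && block_count_ > 0) throw std::bad_alloc();
        // Thread every block onto a singly linked free list.
        for (std::size_t i = 0; i < block_count_; ++i) {
            push(storage_ + i * block_size_);
        }
    }
    ~FixedPool() { std::free(storage_); }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Pops a block off the free list; returns nullptr when exhausted.
    void* allocate() {
        if (!free_list_) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        --free_count_;
        return n;
    }

    // Returns a block previously obtained from allocate().
    void deallocate(void* p) {
        if (p) push(p);
    }

    std::size_t free_blocks() const { return free_count_; }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a Node and stay suitably aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    void push(void* p) {
        Node* n = new (p) Node{free_list_};
        free_list_ = n;
        ++free_count_;
    }

    std::size_t block_size_;
    std::size_t block_count_;
    unsigned char* storage_ = nullptr;
    Node* free_list_ = nullptr;
    std::size_t free_count_ = 0;
};
```

Objects would then be built with placement new on the pointer returned by `allocate()` and destroyed explicitly before calling `deallocate()`. Both calls are a couple of pointer operations, which is where the speedup over a general-purpose malloc comes from.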

David Ashmore
A: 

I tried to go down your path a while ago when faced with multi-threaded contention and a severe fragmentation problem. After quite a bit of testing I concluded that the benefit of these allocators was negligible in most of the interesting cases I had.

The real solution was to roll my own memory manager, specialized for the tasks I was doing most often.

shoosh
A: 

I have integrated NedMalloc into our application and I am quite content with the results. The contention I had seen before is gone, and the allocator was quite easy to plug in. Even the general performance is very good, to the point that the overhead of memory allocations in our application is now close to unmeasurable.

I did not try ptmalloc, as I did not find a Windows-ready version of it and I lost motivation once NedMalloc worked fine for me.

Besides the two mentioned, I think it could also be interesting to try TCMalloc. It has some features which sound better than NedMalloc in theory (like very little overhead for small allocations, compared to the 4 B header used by NedMalloc); however, as it does not seem to have a Windows port ready, it might also turn out to be not exactly easy.


After a few weeks of using NedMalloc I was forced to abandon it, because its space overhead proved to be too high for us. What hit us in particular was that NedMalloc seems to do a poor job of returning memory it no longer uses to the OS, keeping most of it committed. For now I have replaced it with JEMalloc, which is not quite as fast (it is still fast, but not as fast as NedMalloc was), but it is very robust in this respect and its scalability is also very good.


And after a few months of using JEMalloc I have switched to TCMalloc. It took more effort to adapt it for Windows compared to the others, but its results (both performance and fragmentation) seem to be the best for us of what I have tested so far.

Suma
A: 

If you are on Win32, my experience has been that it's hard to beat the regular Windows heap manager, provided you enable the Low Fragmentation Heap using the HeapSetInformation API. I believe this is now standard on newer versions of Windows. It handles locking using Interlocked* Win32 primitives rather than simpler Mutex/CritSec locking.
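For reference, a minimal sketch of opting the default process heap into the Low Fragmentation Heap through HeapSetInformation. The helper name `enable_low_fragmentation_heap` is hypothetical, and the non-Windows branch exists only so the snippet compiles elsewhere. The call can fail, for example when the process runs with the debug heap enabled, so the return value is worth checking.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns true if the LFH was enabled on the
// default process heap, false on failure or on non-Windows platforms.
inline bool enable_low_fragmentation_heap() {
#ifdef _WIN32
    ULONG mode = 2;  // 2 selects the Low Fragmentation Heap
    return HeapSetInformation(GetProcessHeap(),
                              HeapCompatibilityInformation,
                              &mode,
                              sizeof(mode)) != FALSE;
#else
    return false;  // no Windows heap to configure here
#endif
}
```

Calling it once near the start of `main` should be enough for the default heap; any private heap created with HeapCreate would need the same call with its own handle.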

Steve Townsend
It may be hard to beat in single-threaded performance and fragmentation, but unfortunately it is far from scalable to multiple cores. It seems to lack the "thread caching" offered by other scalable allocators, which lets them avoid locking completely in typical situations.
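A rough sketch of what thread caching means here, under heavy simplifying assumptions: each thread keeps a small private list of free blocks and touches the lock-protected shared pool only when that list is empty or full. This is not how TCMalloc, JEMalloc or NedMalloc are actually implemented; `CentralPool`, `ThreadCache`, `local_cache`, `kBlockSize` and `kCacheLimit` are all hypothetical names, and only a single block size is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

constexpr std::size_t kBlockSize = 64;   // single size class (assumption)
constexpr std::size_t kCacheLimit = 32;  // max blocks kept per thread

// Shared pool: every access takes the lock (the slow path).
class CentralPool {
public:
    ~CentralPool() {
        for (void* p : blocks_) std::free(p);
    }
    void* take() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (blocks_.empty()) return std::malloc(kBlockSize);
        void* p = blocks_.back();
        blocks_.pop_back();
        return p;
    }
    void give(void* p) {
        std::lock_guard<std::mutex> guard(mutex_);
        blocks_.push_back(p);
    }
private:
    std::mutex mutex_;
    std::vector<void*> blocks_;
};

inline CentralPool& central() {
    static CentralPool pool;
    return pool;
}

// Per-thread cache: the common case never touches the lock.
class ThreadCache {
public:
    ~ThreadCache() {
        // Hand cached blocks back when the owning thread exits.
        for (void* p : blocks_) central().give(p);
    }
    void* allocate() {
        if (!blocks_.empty()) {  // fast path, no locking
            void* p = blocks_.back();
            blocks_.pop_back();
            return p;
        }
        return central().take();  // slow path, locked
    }
    void deallocate(void* p) {
        if (blocks_.size() < kCacheLimit) {
            blocks_.push_back(p);  // fast path, no locking
        } else {
            central().give(p);     // cache full: return to shared pool
        }
    }
    std::size_t cached() const { return blocks_.size(); }
private:
    std::vector<void*> blocks_;
};

inline ThreadCache& local_cache() {
    thread_local ThreadCache cache;
    return cache;
}
```

A thread that frees and reallocates blocks of the same size stays on the fast path, so threads contend on the mutex only when their local caches run dry or overflow.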
Suma
Fair enough. If/when you have measured using some of those, please let me know your results compared to LFH here.
Steve Townsend