If you are developing a memory-intensive application in C++ on Windows, do you opt to write your own custom memory manager that allocates memory from the virtual address space, or do you let the CRT take control and do the memory management for you? I am especially concerned about fragmentation caused by allocating and deallocating small objects on the heap. Because of this, I think the process could run out of memory even though there is enough memory, just fragmented.

+2  A: 

There used to be excellent third-party drop-in heap replacement library for VC++, but I don't remember the name any more. Our app got 30% speed-up when we started using it.

Edit: it's SmartHeap - thanks, ChrisW

Arkadiy
SmartHeap from MicroQuill
ChrisW
+28  A: 

I think your best bet is not to implement one until profiles prove that the CRT is fragmenting memory in a way that damages the performance of your application. The CRT, core OS, and STL people spend a lot of time thinking about memory management.

There's a good chance that your code will perform just fine under the existing allocators with no changes needed. There's certainly a better chance of that than of you getting a memory allocator right the first time. I've written memory allocators before for similar circumstances and it's a monstrous task to take on. Not so surprisingly, the version I inherited was rife with fragmentation problems.

The other advantage of waiting until a profile shows it's a problem is that you will also know if you've actually fixed anything. That's the most important part of a performance fix.

As long as you're using standard collection classes and algorithms (such as STL/Boost) it shouldn't be very hard to plug in a new allocator later in the cycle to fix the portions of your code base that do need fixing. It's very unlikely that you will need a hand-coded allocator for your entire program.
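To illustrate how small that plug-in step can be, here is a minimal allocator shell for the standard containers. The class name and the malloc/free backing are my own illustration, not anything from the answer; a real replacement would forward to a custom pool instead.

```cpp
#include <cstdlib>
#include <new>
#include <vector>

// Minimal C++11-style allocator: it just forwards to malloc/free here,
// but this shell is exactly where a custom pool would plug in later.
template <typename T>
struct PoolAllocator {
    using value_type = T;

    PoolAllocator() = default;
    template <typename U> PoolAllocator(const PoolAllocator<U>&) {}

    T* allocate(std::size_t n) {
        if (void* p = std::malloc(n * sizeof(T)))
            return static_cast<T*>(p);
        throw std::bad_alloc();
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// Stateless allocators always compare equal.
template <typename T, typename U>
bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) { return false; }
```

Only the declaration changes at the call site: `std::vector<int, PoolAllocator<int>> v;` behaves like a normal vector.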

JaredPar
I wish I could vote this answer up 100 times. I used to work with guys who were convinced they could optimize the code better than the programmers who have full time jobs to work on such things. On top of that, they'd never test any of their "optimizations".
17 of 26
I also regret that I can't upvote for every excellent point in it that's well stated.
David Thornley
But sometimes there is that rare occasion where you really do need that custom allocator that aggregates everything and parcels out fixed blocks. I've worked on more than one project where the availability (or lack) of such meant the difference between success and unemployment.
Crashworks
You will know when you do need it, and even then you can't guarantee that you'll do better than memory allocators that have been refined over the years.
Calyth
A: 

No, I would not.

The chances of me writing better code than the CRT, with who knows how many hundreds of man-years invested in it, are slim.

I would search for a specialized library instead of reinventing the wheel.

Raz
Not necessarily true - you know what and when you are going to allocate/free objects; the people who wrote the CRT didn't. It can be efficient to allocate a large amount of memory in one shot and then manage the storage inside that.
Martin Beckett
This is especially true in circumstances where a system must know it will have enough memory to complete at startup.
Martin Beckett
@mgb I agree that there are circumstances where I would be forced to do that. It would be my last resort. I have huge respect for the amount of work and talent that goes into writing a standard library implementation.
Raz
Reinventing the wheel makes sense if you need a special kind of wheel that's not available in the shops.
Patrick
+2  A: 

Was it SmartHeap from MicroQuill?

codekaizen
yes it was - thanks.
Arkadiy
+1  A: 

you opt to write your own custom memory manager to allocate memory from virtual address space or do you allow CRT to take control and do the memory management for you?

The standard library is often good enough. If it isn't then, instead of replacing it, a smaller step is to override operator new and operator delete for specific classes, not for all classes.
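A sketch of what that per-class override can look like. The `Particle` class and its free-list recycling scheme are my own illustration of the technique, not anything the answer specifies.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical class that routes its own allocations through a simple
// free list instead of the global heap. Only Particle is affected; the
// rest of the program keeps using the default allocator.
class Particle {
public:
    double x = 0, y = 0, z = 0;

    static void* operator new(std::size_t size) {
        // Derived classes (different size) fall back to the global heap.
        if (size != sizeof(Particle) || free_list_.empty())
            return ::operator new(size);
        void* p = free_list_.back();
        free_list_.pop_back();
        return p;
    }

    static void operator delete(void* p, std::size_t size) {
        if (size == sizeof(Particle))
            free_list_.push_back(p);    // recycle instead of freeing
        else
            ::operator delete(p);
    }

private:
    static std::vector<void*> free_list_;
};

std::vector<void*> Particle::free_list_;
```

Deleting a `Particle` parks its storage on the free list, so the next `new Particle` reuses it without touching the heap.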

ChrisW
+1  A: 

It depends very much on your memory allocation patterns. In my personal experience there are generally one or two classes in a project that need special consideration when it comes to memory management, because they are used frequently in the part of the code where you spend lots of time. There might also be classes that need special treatment in some particular context, but can be used in other contexts without bothering about it.

I often end up managing those kinds of objects explicitly in a std::vector or something similar, rather than overriding the allocation routines for the class. For many situations the heap is really overkill: the allocation patterns are so predictable that you don't need to allocate every single instance on the heap, but can use a much simpler structure that allocates larger pages from the heap and has less bookkeeping overhead.
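A minimal sketch of that "larger pages, simpler structure" idea; the class name, page size, and interface are assumptions of mine.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Grabs one large page from the heap at a time and parcels out
// fixed-size blocks from it. BlockCount = 256 is an arbitrary choice.
template <typename T, std::size_t BlockCount = 256>
class PagePool {
public:
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_.empty()) grow();          // one heap call per page
        T* slot = free_.back();
        free_.pop_back();
        return new (slot) T(static_cast<Args&&>(args)...);
    }

    void destroy(T* p) {
        p->~T();
        free_.push_back(p);                 // no heap call at all
    }

private:
    void grow() {
        pages_.push_back(std::make_unique<unsigned char[]>(sizeof(T) * BlockCount));
        unsigned char* base = pages_.back().get();
        for (std::size_t i = 0; i < BlockCount; ++i)
            free_.push_back(reinterpret_cast<T*>(base + i * sizeof(T)));
    }

    std::vector<std::unique_ptr<unsigned char[]>> pages_;
    std::vector<T*> free_;
};
```

Compared to the heap, the bookkeeping is a single vector of free slots, and freeing an object never locks anything global.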

These are some general things to think about:

First, small objects that are allocated and destroyed quickly should be put on the stack. The fastest allocations are the ones that are never made. Stack allocation also happens without any locking of a global heap, which is good for multi-threaded code. Allocating on the heap in C/C++ can be relatively expensive compared to GC languages like Java, so try to avoid it unless you need it.

If you do a lot of allocation you should be careful about threading performance. A classic pitfall is string classes, which tend to do a lot of allocation hidden from the user. If you do lots of string processing in multiple threads, they might end up fighting over a mutex in the heap code. In that case, taking control of the memory management can speed things up a lot. Switching to another heap implementation is generally not the solution here, since the heap will still be global and your threads will still fight over it. Google has a heap implementation (tcmalloc, I believe) that should be faster in multithreaded environments, though I haven't tried it myself.

Laserallan
+1  A: 

From my experience, fragmentation is mostly a problem when you are continuously allocating and freeing large buffers (over 16k, say), since these are the ones that will ultimately cause an out-of-memory failure when the heap cannot find a big enough spot for one of them.

In that case, only these objects should have special memory management (keeping the rest simple): buffer reuse if they always have the same size, or otherwise some kind of memory pool.
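The buffer-reuse case can be sketched like this; the class name, interface, and the 64 KB size in the usage are my own illustrative choices.

```cpp
#include <cstddef>
#include <vector>

// Instead of freeing a large buffer, park it for the next user, so the
// heap never sees the churn of repeated large alloc/free cycles.
class BufferPool {
public:
    explicit BufferPool(std::size_t buffer_size) : size_(buffer_size) {}
    ~BufferPool() { for (char* b : spare_) delete[] b; }

    char* acquire() {
        if (spare_.empty()) return new char[size_];  // grow only on demand
        char* b = spare_.back();
        spare_.pop_back();
        return b;
    }

    void release(char* b) { spare_.push_back(b); }   // reuse, don't free

private:
    std::size_t size_;
    std::vector<char*> spare_;
};
```

Since every buffer has the same size, a released buffer always fits the next request exactly, so this scheme cannot fragment.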

The default heap implementations shouldn't have any problem finding some place for smaller buffers between previous allocations.

total
Most modern memory managers (for example dlmalloc) allocate and free *LARGE* buffers directly from the system allocator so the pages can be mapped / remapped. Therefore, the *LARGE* allocations almost never cause fragmentation of real physical memory (although they can cause some fragmentation of the virtual pages in the address space). As long as you have good handling for small and medium sized blocks, you should be able to avoid fragmentation from large pages.
Adisak
+2  A: 

Although most of you indicate that you shouldn't write your own memory manager, it could still be useful if:

  • you have a specific requirement or situation in which you are sure you can write a faster version
  • you want to write your own memory-overwrite logic (to help in debugging)
  • you want to keep track of the places where memory is leaked

If you want to write your own memory manager, it's important to split it in the following 4 parts:

  1. a part that 'intercepts' the calls to malloc/free (C) and new/delete (C++). This is quite easy for new/delete (just override the global new and delete operators); for malloc/free it is also possible ('overwrite' the CRT functions, redefine the calls to malloc/free, ...)
  2. a part that represents the entry point of your memory manager, and which is called by the 'interceptor' part
  3. a part that implements the actual memory manager. Possibly you will have multiple implementations of this (depending on the situation)
  4. a part that 'decorates' the allocated memory with information of the call stack, overwrite-zones (aka red zones), ...

If these 4 parts are clearly separated, it also becomes easy to replace one part by another, or add a new part to it e.g.:

  • add the memory manager implementation of the Intel Threading Building Blocks library (to part 3)
  • modify part 1 to support a new version of the compiler, a new platform or a totally new compiler
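A toy sketch of parts 1 and 2: the global operator new/delete act as the 'interceptor' and forward into a manager entry point. The `manager_alloc`/`manager_free` names are mine; part 3 here is just a live-allocation counter, and part 4 (decoration) is omitted.

```cpp
#include <atomic>
#include <cstdlib>
#include <new>

// Part 3 (toy): the "manager" only counts live allocations.
static std::atomic<long> g_live{0};

// Part 2: the single entry point every intercepted call goes through.
static void* manager_alloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) ++g_live;
    return p;
}
static void manager_free(void* p) {
    if (p) { --g_live; std::free(p); }
}

// Part 1: intercept C++ allocation by overriding the global operators.
void* operator new(std::size_t n) {
    if (void* p = manager_alloc(n)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { manager_free(p); }
void operator delete(void* p, std::size_t) noexcept { manager_free(p); }
```

Because the interceptor only forwards, swapping in a real part 3 (a pool, TBB's allocator, ...) means changing `manager_alloc`/`manager_free` and nothing else.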

Having written a memory manager myself, I can only say that it is really handy to have an easy way to extend your own memory manager. E.g. what I regularly have to do is find memory leaks in long-running server applications. With my own memory manager I do it like this:

  • start the application and let it 'warm up' for a while
  • ask your own memory manager to dump an overview of the used memory, including the call stacks at the moment of the call
  • continue running the application
  • make a second dump
  • sort the two dumps alphabetically on call stack
  • look up the differences
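The dump-and-diff step above can be sketched like this. A real manager records call stacks (e.g. via CaptureStackBackTrace on Windows); in this portable toy each allocation is tagged with a caller-supplied site string, and all names are my own.

```cpp
#include <map>
#include <string>

// Tracks live bytes per allocation site and diffs two snapshots:
// sites whose live byte count grew between dumps are leak suspects.
class AllocRegistry {
public:
    void on_alloc(const std::string& site, long long bytes) { live_[site] += bytes; }
    void on_free (const std::string& site, long long bytes) { live_[site] -= bytes; }

    // A "dump" is just a snapshot of live bytes per site.
    std::map<std::string, long long> dump() const { return live_; }

    static std::map<std::string, long long>
    diff(const std::map<std::string, long long>& first,
         const std::map<std::string, long long>& second) {
        std::map<std::string, long long> grown;
        for (const auto& entry : second) {
            auto it = first.find(entry.first);
            long long before = (it == first.end()) ? 0 : it->second;
            if (entry.second > before)
                grown[entry.first] = entry.second - before;
        }
        return grown;
    }

private:
    std::map<std::string, long long> live_;
};
```

Sites that allocate and free in balance disappear from the diff; only the growing ones remain, which is exactly the list you want while the application keeps running.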

Although you can do similar things with out-of-the-box components, they tend to have some disadvantages:

  • often they seriously slow down the application
  • often they can only report leaks at the end of the application, not while the application is running

But, also try to be realistic: if you don't have a problem with memory fragmentation, performance, memory leaks or memory overwrites, there's no real reason to write your own memory manager.

Patrick