I'm using custom allocators here; you might even say it was to work around other custom dynamic memory management.
Background: we have overloads for malloc, calloc, free, and the various variants of operator new and delete, and the linker happily makes STL use these for us. This lets us do things like automatic small object pooling, leak detection, alloc fill, free fill, padding allocation with sentries, cache-line alignment for certain allocs, and delayed free.
The problem is, we're running in an embedded environment -- there isn't enough memory around to actually do leak detection accounting properly over an extended period. At least, not in the standard RAM -- there's another heap of RAM available elsewhere, through custom allocation functions.
Solution: write a custom allocator that uses the extended heap, and use it only in the internals of the memory leak tracking architecture... Everything else defaults to the normal new/delete overloads that do leak tracking. This avoids the tracker tracking itself (and provides a bit of extra packing functionality too, we know the size of tracker nodes).
We also use this to keep function cost profiling data, for the same reason; writing an entry for each function call and return, as well as thread switches, can get expensive fast. Custom allocator again gives us smaller allocs in a larger debug memory area.