Hi folks, here's a simple test I did on MSVC++ 2010 under Windows 7:

// A struct with sizeof(s) == 4, i.e. 4 bytes
struct s
{
    int x;
};

int main()
{
    // Allocate 1 million structs in a single block
    s* test1 = new s[1000000];

    // Memory usage shows that the increase in memory is roughly 4 bytes * 1000000 - as expected.
    // NOW! If I run this instead:
    for (int i = 0; i < 1000000; i++)
        new s();   // leaked deliberately, so the allocations stay visible in the memory usage

    // The memory usage is disproportionately large: divided by 1000000, it works out to about 64 bytes per s!
}

Is this common knowledge, or am I missing something? Until now I always created objects on the fly when needed, for example new Triangle() for every triangle in a mesh, etc.

Is there really an order-of-magnitude overhead for dynamically allocating individual instances?

Cheers

EDIT:

Just compiled and ran the same program at work on Windows XP using g++: now the overhead is 16 bytes, not the 64 observed before! Very interesting.

+1  A: 

Is that for a debug build? In a debug build, MSVC allocates "guard" bytes around each object to detect writes past the object's boundary.
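
For example, here's a minimal sketch (MSVC-specific, and only meaningful in a debug build) that asks the debug heap to verify those guard bytes:

#include <crtdbg.h>

int main()
{
#ifdef _DEBUG
    // Ask the CRT debug heap to track allocations.
    _CrtSetDbgFlag(_CrtSetDbgFlag(_CRTDBG_REPORT_FLAG) | _CRTDBG_ALLOC_MEM_DF);

    int* p = new int(42);

    // Walks the debug heap and verifies the "no man's land" guard
    // bytes around every block - guards that are themselves part of
    // the extra per-allocation footprint in debug builds.
    _CrtCheckMemory();

    delete p;
#endif
    return 0;
}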

Blindy
Both release and debug builds. Just to be sure, I've also tried on a 32-bit system and still see significant overhead for individual allocations. There must be some overhead, of course, but I wouldn't expect it to be this big.
Vadim
+2  A: 

Not necessarily, but the operating system will usually reserve memory on your behalf in chunks of whatever size it finds convenient; on your system, I'd guess it hands out multiples of 64 bytes per request.

There is an overhead associated with keeping track of the memory allocations, after all, and reserving very small amounts isn't worthwhile.
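
As a rough way to see that rounding, you can ask the allocator how much it actually reserved for a request. This sketch assumes MSVC, whose CRT provides _msize(); glibc's malloc_usable_size() is the rough equivalent. Note that the reported figure still excludes the allocator's own header bookkeeping:

#include <malloc.h>    // _msize() - MSVC-specific
#include <cstdlib>
#include <iostream>

int main()
{
    for (std::size_t request = 1; request <= 64; request *= 2)
    {
        void* p = std::malloc(request);
        // _msize reports the usable size of the block, which shows the
        // rounding; the allocator's own header bookkeeping is on top.
        std::cout << "requested " << request
                  << " -> got " << _msize(p) << " bytes" << std::endl;
        std::free(p);
    }
    return 0;
}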

Brian Hooper
Well, the problem is also evident with much larger objects - the original problem was with objects of about 50-80 bytes. That's when I started investigating and ran the array vs. individual allocation test.
Vadim
Hmmm. I'll have a go and see what happens on my machine.
Brian Hooper
Tried it; the usage for such objects is 4 bytes (in bulk) or 32 bytes (individually) for objects containing an int. Expanding the object to 16 ints, I got 64 bytes per object (in bulk) and 78 bytes per object (individually). That suggests to me about 16 bytes of overhead per allocation, with alignment on 16-byte boundaries; some figure of that order, anyway. I'm using Ubuntu 10.04, btw.
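
One way to approximate that measurement in code - a sketch that assumes consecutive allocations land roughly back to back, which no allocator guarantees - is to look at the address deltas between successive allocations:

#include <cstdio>

struct S { int x[16]; };   // 64-byte payload, as in the expanded test

int main()
{
    const int N = 1000;
    S* p[N];
    for (int i = 0; i < N; ++i)
        p[i] = new S;

    // The address delta between consecutive allocations approximates
    // the true per-allocation footprint: payload + header + padding.
    unsigned long total = 0;
    for (int i = 1; i < N; ++i)
    {
        char* a = reinterpret_cast<char*>(p[i]);
        char* b = reinterpret_cast<char*>(p[i - 1]);
        total += static_cast<unsigned long>(a > b ? a - b : b - a);
    }
    std::printf("average stride: %lu bytes\n", total / (N - 1));

    for (int i = 0; i < N; ++i)
        delete p[i];
    return 0;
}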
Brian Hooper
A: 

There is usually overhead on any single memory allocation. This comes from my knowledge of malloc rather than new, but I suspect the situation is the same.

A section of the memory arena, when carved out for an allocation of (say) 30 bytes, will typically have a header (e.g., 16 bytes; all such figures below are examples only and may differ in practice) and may be padded to a multiple of 16 bytes for easier arena management.

The header is usually important to allow the section to be re-integrated into the free memory pool when you're finished with it.

It contains information about the size of the block at a bare minimum and may have memory guards as well (to detect corruption of the arena).

So, when you allocate your one-million-structure array, you'll find that it uses only an extra 16 bytes for the header (4,000,016 bytes in total). When you allocate one million individual structures, each and every one of them carries that overhead.
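
To make that arithmetic concrete, here's a toy model of the bookkeeping; the 16-byte header and 16-byte granularity are the example figures from above, not MSVC's real numbers:

#include <cstddef>
#include <iostream>

// Toy model: each block is the payload plus a header, rounded up
// to the arena's allocation granularity. Figures are examples only.
std::size_t block_size(std::size_t request)
{
    const std::size_t header = 16;   // size, flags, guards, ...
    const std::size_t align  = 16;   // arena management granularity
    return (request + header + align - 1) / align * align;
}

int main()
{
    std::cout << block_size(30) << "\n";       // the 30-byte example -> 48
    std::cout << block_size(4)  << "\n";       // one 4-byte struct   -> 32
    std::cout << block_size(4000000) << "\n";  // the whole array     -> 4000016
    return 0;
}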

I answered a related question here with more details. I suspect the header will need to hold even more information for C++, since it probably has to store the number of items over and above the section size (so that delete[] can call the right number of destructors), but that's just supposition on my part. It doesn't affect the fact that accounting information of some sort is needed per allocated item.

If you really want to see what the space is being used for, you'll need to dig through the MSVC runtime source code.

paxdiablo
Yes, that makes perfect sense. I guess for some reason I get more padding, header, etc. reserved. So what is a general guideline for dynamically allocating multiple objects? The reason for the individual allocations is to work around vector<SuperClass>, where one can't push_back subclasses, so I use vector<SuperClass*>. I'm now considering a vector<SubClass> per subclass...
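
Something along these lines, perhaps (a sketch only; Shape, Triangle and Quad are placeholder names):

#include <vector>

// Sketch of the per-subclass storage idea. Shape, Triangle and Quad
// are hypothetical stand-ins for the real class hierarchy.
struct Shape            { virtual ~Shape() {} };
struct Triangle : Shape { int v[3]; };
struct Quad     : Shape { int v[4]; };

int main()
{
    std::vector<Triangle> triangles;
    std::vector<Quad>     quads;
    triangles.reserve(1000000);   // reserve up front: if the vectors grow,
    quads.reserve(1000000);       // they reallocate and invalidate pointers

    std::vector<Shape*> scene;    // polymorphic view over both pools
    triangles.push_back(Triangle());
    scene.push_back(&triangles.back());
    quads.push_back(Quad());
    scene.push_back(&quads.back());

    // scene can now be traversed polymorphically, while the objects
    // themselves live in a handful of big allocations instead of
    // one heap allocation each.
    return 0;
}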
Vadim
Well, the rule I follow is: if I'm going to allocate just a few, I allocate them singly; the wasted memory doesn't really affect me. If I'm going to allocate a lot, I allocate a collection of them (such as an array or some other abstract data type) so that the overhead is minimised. Not sure how to solve your specific vector problem; most of my work is in either C or Java, in both of which I'd probably write my own vector stuff if there was nothing suitable :-)
paxdiablo
+1  A: 

You should look at a malloc implementation; that will probably clear things up.

I'm not sure whether the source of MSVC++'s malloc can be viewed anywhere, though. If not, look at some other implementation; they are probably all similar to some degree.

Don't expect a malloc implementation to be simple. It needs to search for free space in the already-allocated virtual pages or allocate a new virtual page, it must do this as fast as possible, and it must be thread-safe. Maybe your malloc implementation has some sort of bitvector recording which 64-byte chunks are free in a page, and it just takes the next free chunk.
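
As a toy illustration of that bitvector idea (a sketch only - real allocators add size classes, thread caches and locking): one 4 KB page split into 64-byte chunks, with a 64-bit mask recording which chunks are free:

#include <cstdint>
#include <cstdlib>

// Toy fixed-size-chunk allocator: one page split into 64-byte
// chunks, with a bitvector marking which chunks are free.
class ChunkPage
{
    static const std::size_t kChunk = 64;
    static const std::size_t kCount = 64;        // 64 chunks * 64 B = 4 KB
    unsigned char* page_;
    std::uint64_t  free_;                        // bit i set => chunk i is free
public:
    ChunkPage()
        : page_(static_cast<unsigned char*>(std::malloc(kChunk * kCount))),
          free_(~std::uint64_t(0)) {}
    ~ChunkPage() { std::free(page_); }

    void* allocate()
    {
        if (free_ == 0) return 0;                // page exhausted
        std::size_t i = 0;                       // find the lowest free chunk
        while (!(free_ & (std::uint64_t(1) << i))) ++i;
        free_ &= ~(std::uint64_t(1) << i);       // mark it used
        return page_ + i * kChunk;
    }

    void release(void* p)
    {
        std::size_t i = (static_cast<unsigned char*>(p) - page_) / kChunk;
        free_ |= std::uint64_t(1) << i;          // mark it free again
    }
};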

Albert