I am working with VC++ 2005 and have overloaded the new and delete operators. All is fine.

My question is related to some of the magic VC++ is adding to the memory allocation.

When I use the C++ call:

data = new _T [size];

The pointer returned from the global memory allocation is (for example) 071f2ea0, but data is set to 071f2ea4.

When the overloaded delete [] is called, the address 071f2ea0 is passed in.

Another note is when using something like:

data = new _T;

both data and the return from the global memory allocation are the same.

I am pretty sure Microsoft is adding something at the head of the memory allocation to use for bookkeeping. My question is: does anyone know the rules Microsoft is using?

I want to pass in the value of "data" into some memory testing routines so I need to get back to the original memory reference from the global allocation call.

I could assume the 4 bytes are an index, but I wanted to make sure. It could just as easily be a flag plus offset, a count, an index into some other table, or just alignment to the CPU's cache line. I need to find out for sure. I have not been able to find any references that outline the details.

I also think that on one of my other runs the offset was 6 bytes, not 4.

+4  A: 

The 4 bytes most likely contain the total number of objects in the allocation, so that delete [] can loop over all objects in the array and call their destructors.

To get back the original address, you could keep a lookup-table keyed on address / 16, which stores the base address and length. This will enable you to find the original allocation. You need to ensure that your allocation+4 doesn't cross a 16-byte boundary, however.

EDIT: I went ahead and wrote a test program that creates 50 objects with a destructor via new, and calls delete []. The destructor just calls printf, so it won't be optimized away.

#include <stdio.h>

class MySimpleClass
{
    public:
    ~MySimpleClass() {printf("Hi\n");}
};

int main()
{
    MySimpleClass* arr = new MySimpleClass[50];
    delete [] arr;


    return 0;
}

The partial disassembly is below, cleaned up to be a bit more legible. As you can see, VC++ is storing the array count in the initial 4 bytes.

; Allocation
mov ecx, 36h ; Size of allocation
call    scratch!operator new
test    rax,rax ; Don't write 4 bytes if NULL.
je      scratch!main+0x25
mov     dword ptr [rax],32h ; Store 50 in first 4 bytes
add     rax,4 ; Increment pointer by 4

; Free
lea     rdi,[rax-4] ; Grab previous 4 bytes of allocation
mov     ebx,dword ptr [rdi] ; Store in loop counter
jmp     StartLoop ; Jump to beginning of loop
Loop:
lea     rcx,[scratch!`string' (00000000`ffe11170)] ; 1st param to printf
call    qword ptr [scratch!_imp_printf] ; Destructor
StartLoop:
sub     ebx,1 ; Decrement loop counter
jns     Loop ; Loop while not negative

This bookkeeping is distinct from the bookkeeping that malloc or HeapAlloc do. Those allocators don't care about objects and arrays. They only see blobs of memory with a total size. VC++ can't query the heap manager for the total size of the allocation, because that would mean the heap manager would be bound to allocate a block of exactly the size you requested. The heap manager shouldn't have this limitation: if you ask for 240 bytes to hold 20 12-byte objects, it should be free to return a 256-byte block that it has immediately available.

Michael
That can't be all there is to it. In modern compilers you can call delete and not delete [] and it will still act like you called delete []. I know that is bad programming, but I think it does work.
Jim Kramer
@Jim Kramer: If you call delete when you should have called delete[], then the compiler won't call your destructor for all the elements of the array (only the first one, if you're lucky).
Greg Hewgill
I know, but all that memory block will be cleaned up, thus it must know how to interpret the leading information. Don't forget it's the leading information that I am trying to understand and I am not willing to just assume that it's 4 bytes under all conditions.
Jim Kramer
The total size of the memory block is stored in yet another place. This is the same for all allocations based on malloc or new; there is usually a block header stored *before* the pointer that is returned to you. This block header is used by the memory allocation system for bookkeeping, and contains the total size of the allocated block.
Greg Hewgill
@Jim Kramer: See this question for the discussion of using delete instead of delete[] http://stackoverflow.com/questions/787417/why-would-you-write-something-like-this-intentionally-not-using-delete-on-an
sharptooth
+1  A: 

The 4-byte offset is for the number of elements. When delete[] is invoked, it needs to know the exact number of elements so that it can call the destructor for exactly that many objects.

Since the memory allocator could have returned a bigger block than necessary for storing all the objects the only sure way to know the number of elements is to store it in the beginning of the block.

sharptooth
A: 

For sure, memory allocation does need some bookkeeping info stored together with the actual memory.

Next to that, heap-allocated blocks will also be 'decorated' with some magic values, used to easily detect buffer overruns, double deletion, and so on. Take a look at this CodeGuru site for more info. If you want to know the rest about heap debugging, take a look at the MSDN documentation.

xtofl
A: 

The final answer is:

When you call malloc (which new uses under the hood), it allocates more memory than requested so that the system can manage the heap. That bookkeeping and debug information is not what I am interested in. What I am interested in is the difference between the pointer returned by malloc and the pointer a C++ array allocation hands back.

What I have been able to determine is that C++ sometimes adds 4 additional bytes to keep count of the objects being allocated. The twist is that these 4 bytes are only added if the objects being allocated need to be destructed.

So given:

void* _cdecl operator new(size_t size)
{
    void *ptr = malloc(size);
    return(ptr); 
}

for the case of:

object * data = new object [size];

data will be ptr plus 4 bytes (assuming object requires a destructor)

while:

char *data = new char [size];

data will equal ptr because no destructor is required.

Again, I am not interested in the memory tracking malloc adds to manage memory.

Jim Kramer