views:

413

answers:

4

hello, guys

I'm confused a little while writing own tiny discovering program to clear up how Visual C++ allocates the memory for dynamic arrays. I must note, I have never met technical documents that describe this question on new[]/delete[] operators for any C++ implementation.

Initially I thought that new[] and delete[] are something similar to the following if it is interpreted as simple C:

void fake_int_ctor(int _this) {
    printf("borns with 0x%08X in the heap\n", _this);
}

void fake_int_dtor(int _this) {
    printf("dies with %d\n", _this);
}

void *new_array(unsigned int single_item_size, unsigned int count, void (*ctor)()) {
    unsigned int i;
    unsigned int *p = malloc(sizeof(single_item_size) + sizeof(count) + single_item_size * count);
    p[0] = single_item_size; // keep single item size for delete_array
    p[1] = count; // and then keep items count for delete_array
    p += 2;
    for ( i = 0; i < count; i++ ) {
        ctor(p[i]); // simulate constructor calling
    }
    return p;
}

void delete_array(void *p, void (*dtor)()) {
    unsigned int *casted_p = p;
    unsigned int single_item_size = casted_p[-2];
    unsigned int count = casted_p[-1];
    unsigned int i;
    for ( i = 0; i < count; i++ ) {
        dtor(casted_p[i]); // simulate destructor
    }
    free(casted_p - 2);
}

void test_allocators(void) {
    unsigned int count = 10;
    unsigned int i;
    int *p = new_array(sizeof(int), count, fake_int_ctor); // allocate 10 ints and simulate constructors
    for ( i = 0; i < count; i++ ) {
        p[i] = i + i; // do something
    }
    delete_array(p, fake_int_dtor); // deletes the array printing death-agony-values from 0 to 19 stepping 2
}

This code implies the following structure for dynamic arrays:

-2..-1..0.....|.....|.....|.....
^   ^   ^
|   |   +-- start of user data, slots may have variable size
|   |       depending on "single item size" slot
|   +------ "items count" slot
+---------- "single item size" slot

My VC++ compiler generated the program that produces the following output:

borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
borns with 0xCDCDCDCD in the heap
dies with 0
dies with 2
dies with 4
dies with 6
dies with 8
dies with 10
dies with 12
dies with 14
dies with 16
dies with 18

Obviously, everything is fine in this case. But now when I was trying to discover the nature of "native" VC++ dynamic arrays allocators, I understand that I'm wrong (at least for VC++). So I've got several questions. Where the values of dynamic array sizes are stored in? How do the dynamic arrays allocators work? Which byte-by-byte structure do they use for dynamic arrays? Or... Or could you provide any links that would clarify this for me (VC++ has the highest priority), please?

+1  A: 

It's implementation defined.

Thus the implementation is (and often does) change with different makes/versions of the compiler (on some systems it differs between release and debug versions)

But saying that you are on the correct lines, thou usually there is more information like the actuall size of the block (if an exact fit was not found) sometimes there is an end block linking back to the startblock so that you can follow chains backwards. Etc...

Martin York
+2  A: 

I'm not sure what you are looking for here but fake_int_ctor(int) is printing uninitialized memory in the allocated array. Try something like this instead:

void fake_int_ctor(int& _this) {
    printf("born at %p\n", (void*)&_this);
}

void fake_int_dtor(int& _this) {
    printf("dies at %p\n", (void*)&_this);
}

This should print out the addresses. I'm guessing that this is more along the lines of what you want to see.

This little program isn't really showing anything since you are just allocating a chunk of contiguous storage (ala malloc) and printing out the range of addresses. Nothing really surprising there. The actual storage of arrays is implementation defined. The only thing that is guaranteed is that when you do something like C *p = new C[10], p will point to enough contiguous storage for 10 C objects. How the environment keeps track of what was allocated so that delete [] p calls the destructors for each allocated element is completely implementation defined.

If you really want to dig into this, then start with something like the following snippet. Compile it with assembly listings enabled and look at the generated assembly code.

struct C {
  C(): x(0) {}
  int x;
};

int main() {
  C *p = new C[10];
  for (int i=0; i<10; ++i) {
    p[i].x = i;
  }
  delete [] p;
  return 0;
}

You should be able to figure out how the compiler represents arrays as long as you turn off all of the optimizations.

D.Shawley
Really, I was on the correct line. :)http://blogs.msdn.com/oldnewthing/archive/2004/02/03/66660.aspxThank you. )
Lyubomyr Shaydariv
One more thing. You said, "fake_int_ctor(int) is printing uninitialized memory". Yes, you're quite right, but my code is semantically correct. I mean I've tried to simulate uninitialized object behavior, but I have just simplified "this" pointer and just made it a value, not a reference or pointer, to make the code more clear.
Lyubomyr Shaydariv
+1  A: 

Share some information w/ you, though it's about GNU C/++, but many C/C++ compilers have something which is similar with each other:

When new() implementing code (the lib code exactly) was executed to allocate a dynamic object or array. The allocated memory in which the object and array placed is a part of a memory buffer which lib code real uses, and the structure looks like the figure below:

+--------------------+-------------------------+
| Overhead    Area   |  Your Object or Array   |
+--------------------+-------------------------+
                     ^
                     |
  CObject *pArray ---+

of course you will only use the right area, and cannot overwrite the left area, the Overhead. The pArray returned by "new CObject[]" points to the right area (as shown in above figure) so in general a user will not notice the Overhead (again, the user cannot use the Overhead area).

The size of dynamic allocated array was stored in the Overhead area. When delete[] is called, the lib code will know the array size from the information of the Overhead area.

Lots of things other than the allocated size will be stored in the overhead too.

If you are using low level malloc()/free()/brk() etc to implement new() and delete(), you may reserve an overhead area and return the followed right usable area to the user too, just as GNU C/++.

EffoStaff Effo
Yep, my own solution was almost in this way. :) I'm glad. Thank you. :)
Lyubomyr Shaydariv
A: 

By the way, such approach explains why a destructor is called once in "arrayed" new [] finalizing with simple delete.

Lyubomyr Shaydariv