Possible Duplicate:
C programming : How does free know how much to free?

How is it that we can delete dynamically allocated arrays, but we can't find out how many elements they hold? Can't we just divide the size of the memory location by the size of each object?

A: 

To free an allocated block of memory you can use free(nameofthearray); for this you must include <malloc.h>. To find out how many objects a specific array contains you can use sizeof(nameofthearray)/sizeof(typeoftheobjectinthearray).

Miguel
-1. The first works only for dynamically allocated arrays, the second only for static and automatic ones.
larsmans
Sorry, but the original poster clearly stated "How is it that we can delete *dynamically* allocated arrays". So I don't understand your point.
Miguel
-1: The `sizeof` trick doesn't work for arrays allocated on the heap.
Oli Charlesworth
`malloc.h` is not a standard header. What I said is that `sizeof` doesn't work for dynamic arrays.
larsmans
+6  A: 

The memory allocator remembers the size of the allocation, but doesn't give it to the user. This is true in C with malloc and in C++ with new.

"The size of the memory location" cannot be obtained. If you do

int *a = new int[N];
std::cout << sizeof(a);

you'll find that it prints sizeof(int *), which is constant (for a given platform).

larsmans
+2  A: 

Two things work against it:

  1. First, arrays and pointers are interchangeable - an array does not have any additional understanding of its length (see the sketch just after this list). ( *All smart-arse commentators tempted to comment on the fundamental differences between arrays and pointers should note that none of that makes any difference in respect to this point ;) * )

  2. Second, knowing the size of the allocation is the business of the heap, and the heap does not expose any standard way of discovering the size of an allocation.
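
A quick sketch of the first point (plain standard C++, nothing compiler-specific assumed): once an array has decayed to a pointer - which is what happens the moment it is passed to a function - the length information is simply gone:

#include <cstdio>

// Despite the "10" in the declaration, the parameter 'a' is really an int*.
// (Compilers often warn about sizeof here - which is exactly the pitfall shown.)
void takes_array(int a[10]) {
    std::printf("in the function: %zu\n", sizeof(a));   // sizeof(int *)
}

int main() {
    int arr[10];
    std::printf("in main: %zu\n", sizeof(arr));          // 10 * sizeof(int)
    takes_array(arr);
}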

Symbian, for example, does have an AllocSize() function from which you can derive how many elements are in the array, though allocations are sometimes larger than asked for because the heap manages memory in word-aligned chunks.

Will
+11  A: 

In C++, both...

  • the size (bytes) requested by a new, new[] or malloc call, and
  • the number of array elements requested in a new[] dynamic allocation

...are implementation details that the Standard doesn't require be made available programmatically, even though the memory allocation library must remember the former, and the compiler the latter, so that the destructor can be invoked on the correct number of elements.

Sometimes the compiler can see that an allocation has a constant size and can reliably associate it with the corresponding deallocation, so it could generate code customised for these compile-time-known values (e.g. inlining and loop unrolling). But in complex usage (and when handling external inputs) a compiler may need to store and retrieve the element count at run-time: space for that counter might be placed - for example - immediately before or after the address returned for the array content, with delete[] knowing about this convention. In practice, a compiler may choose to always handle this at run-time just for the simplicity that comes with consistency. Other run-time possibilities exist: e.g. the element count might be derivable from some insight into the specific memory pool from which the allocation was satisfied, combined with the object size.
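
As a rough illustration of that convention (strictly an implementation detail: the extra bytes and their layout come from ABIs such as the Itanium C++ ABI, and nothing below is guaranteed by the Standard), replacing the global array forms of operator new/delete lets you watch the compiler request a few extra "cookie" bytes when the element type has a non-trivial destructor:

#include <cstdio>
#include <cstdlib>
#include <new>

// Replacement operator new[]: report how many bytes the compiler asked for.
void* operator new[](std::size_t bytes) {
    std::printf("operator new[] asked for %zu bytes\n", bytes);
    if (void* p = std::malloc(bytes)) return p;
    throw std::bad_alloc();
}

void operator delete[](void* p) noexcept { std::free(p); }

struct NeedsDtor { char c; ~NeedsDtor() {} };   // non-trivial destructor

int main() {
    int* a = new int[10];              // typically exactly 10 * sizeof(int)
    NeedsDtor* b = new NeedsDtor[10];  // typically 10 bytes plus a cookie holding the count
    delete[] b;                        // the cookie tells delete[] to run 10 destructors
    delete[] a;
}

For a type like int with a trivial destructor an implementation is free to omit the cookie entirely, which ties in with the comments below.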

The Standard doesn't provide programmatic access to these values, precisely so that implementations remain unfettered in the optimisations (in speed and/or space) they may use.

(The size of the memory location may be greater than the exact size required for the requested number of elements - that size is remembered by the memory allocation library, which may be a black-box library independent of the C++ compiler).

Tony
@Tony: while I agree with the explanation, since the destructors are supposed to be run, it means the number of elements must be known. That there is no standard API to retrieve it seems inherited from C where there was no destructor to run, and thus where only the allocated size mattered.
Matthieu M.
@Matthieu M.: true that C++ was shaped by C, though C++ evolved away from that based on demand, opportunity, feedback and problem-solving. If programmers consistently asked for this - and could prove utility - it could have been available a long time ago, providing the implementers didn't yell louder re the work or existing optimisations. Anyway, long live std::vector() :-).
Tony
@Matthieu: "since the destructors are supposed to be run, it means the number of elements must be known" - if the contained type of the array has a destructor that does anything. As an optimization, a C++ implementation could store the number of elements requested for a `std::string[]`, but not store it for an `int[]`. I suppose that an API to retrieve the size would either have to formulate when it works and when it doesn't, or else would in effect forbid the optimization.
Steve Jessop
+3  A: 

You can easily make a class to keep track of the allocation count.

The reason we don't know the length is that it has always been an implementation detail (AFAIK). The compiler knows the elements' alignment, and the ABI will also affect how it is implemented.

For example, the Itanium 64 ABI stores the cookie (element count) in the leading bytes of the allocation (specifically, for non-POD element types), then pads to the objects' natural alignment if necessary. You are then returned (from new[]) the address of the first usable element, rather than the address of the actual allocation, so there is a fair amount of non-portable bookkeeping involved.

A wrapper class is the easy way to manage this.

It's actually an interesting exercise to write allocators, override a class's operator new/delete and the placement forms, and look at how this all fits together (although it's not a particularly trivial exercise if you want the allocator to be used in production code).

In short, we don't know the size of the memory allocation, and it is more effort to figure out the allocation size (among other necessary variables) consistently across multiple platforms than it is to use a custom template class which holds a pointer and a size_t.

Furthermore, there is no guarantee that the allocator allocated exactly the number of bytes requested (so your counts could be wrong if you derive them from the allocation size). If you go through the malloc interfaces, you should be able to locate your allocation... but that's still not very useful, portable, or safe for any non-trivial case.

Update:

@Default there are many reasons to create your own interface. As Tony mentioned, std::vector is one well-known implementation. The basis for such a wrapper is simple (the interface borrows bits from std::vector):

#include <cassert>   // assert
#include <cstddef>   // size_t

/* Holds an array of @a TValue objects which are created at construction and
   destroyed at destruction. The interface borrows bits from std::vector. */
template<typename TValue>
class t_array {
    t_array(const t_array&);             // copying prohibited
    t_array& operator=(const t_array&);  // assignment prohibited
    typedef t_array<TValue> This;
public:
    typedef TValue value_type;
    typedef value_type* pointer;
    typedef const value_type* const_pointer;
    typedef value_type* const pointer_const;
    typedef const value_type* const const_pointer_const;
    typedef value_type& reference;
    typedef const value_type& const_reference;

    /** creates @a count objects, using the default ctor */
    t_array(const size_t& count) : d_objects(new value_type[count]), d_count(count) {
        assert(this->d_objects);
        assert(this->d_count);
    }

    /** this owns @a objects */
    t_array(pointer_const objects, const size_t& count) : d_objects(objects), d_count(count) {
        assert(this->d_objects);
        assert(this->d_count);
    }

    ~t_array() {
        delete[] this->d_objects;
    }

    const size_t& size() const {
        return this->d_count;
    }

    bool empty() const {
        return 0 == this->size();
    }

    /* element access */
    reference at(const size_t& idx) {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    const_reference at(const size_t& idx) const {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    reference operator[](const size_t& idx) {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    const_reference operator[](const size_t& idx) const {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    pointer data() {
        return this->d_objects;
    }

    const_pointer data() const {
        return this->d_objects;
    }

private:
    pointer_const d_objects;
    const size_t d_count;
};
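
A brief usage sketch (hypothetical driver code, assuming the t_array class above is in scope): the wrapper remembers the element count itself, so size() is always available.

#include <iostream>

int main() {
    t_array<int> a(5);                       // allocates and owns 5 ints
    for (size_t i = 0; i < a.size(); ++i)
        a[i] = static_cast<int>(i * i);
    std::cout << "size: " << a.size()
              << ", last: " << a[a.size() - 1] << '\n';
    return 0;                                // ~t_array delete[]s the owned block
}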

As useful as std::vector is, there are some cases where it can be useful to create your own:

  • to make an object with a smaller interface; minimalism is good.
  • to make an object which requires no allocator. For example, t_array will result in fewer exported symbols, as well as shorter names for those symbols (by removing the allocator parameter).
  • to make variants which handle additional const cases. In the example above, there is often little reason to change what the container points to, so t_array uses two const members, each ensuring less variation than std::vector. A good optimizer should make use of those details, and it also prevents users from making accidental mistakes.
  • to reduce build times. If your needs are as simple as t_array, or even simpler, you can reduce your build times by using a minimal interface.

Other cases:

  • to make an object with a larger interface, or more features
  • to make an object with additional debugging facilities
  • to make an object which may be subclassed (most implementations of std::vector are not intended to be subclassed)
  • to make an object which is thread-safe
Justin
Beat me to the wrapper class suggestion.
Dalin Seivewright
+1 pending: Do you have any links/documentation regarding the wrapper-thingy? It sounded interesting.
Default
@Default: the most famous wrapper class for C++ is std::vector (google "sgi stl vector" for a great reference page).
Tony
Oh, I thought he was talking about some special class with some sort of memory management for dynamic allocation.. well, +1 then :)
Default
@Default updated the response to include the example, and described why one would use such a class (even though `std::vector` is useful for many cases, it is not ideal for every case).
Justin
@Justin: good answer. My main thought is: building your own variation of vector makes your program less portable. So although I enjoy optimal code, I think I'll stick to vector for the time being :) I think what caught my interest would be a class which actually managed the objects' heap memory and individual sizes or something.. Memory management and handlers is one of the areas of C++ which I would like to know more about but never had the time to investigate in depth.
Default
Justin
(continued) There are many cases where you may not want to pass a vector, or a vector which uses an alternative allocator, or cases where the client wants to operate only on a specific range of the allocation. A `t_contiguous_allocation_with_size` object wrapper can be written to support iterators - since this is template based, it should be quite easy to add the interfaces you'll need, and to reuse STL algorithms. Templates are usually better, but sometimes you need to keep build times down (or implementations private), so the implementation must reside in a single translation unit in some cases.
Justin
+3  A: 

The common way in C++ is to use std::vector instead of a raw array.

std::vector has a size() method which returns the number of elements.

You should prefer std::vector to raw arrays wherever possible.
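
A minimal illustration (nothing beyond the standard library is assumed):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(10, 42);     // 10 elements, each initialized to 42
    std::cout << v.size() << '\n';  // prints 10 - the element count is always known
    v.push_back(7);
    std::cout << v.size() << '\n';  // prints 11
}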

MOnsDaR
+1 - C++ is all about enabling problems to be solved by library authors.
Daniel Earwicker
+3  A: 

The reason is that the C languages do not expose this information, although it might be available to the specific implementation. (Indeed for array new[] in C++ the size has to be tracked to call the destructors for each object -- but how this is done is up to the specific compiler.)

The reason for this non-disclosure is so that compiler-writers and platform implementers have more freedom in how they implement variable-size memory allocations. It is also not necessary to know this information in general, so it would not make sense to require each C platform to make this info available.

Also, one practical reason (for malloc et al.) is that they do not give you exactly what you asked for: if you ask malloc for 30 bytes of memory, it will most likely give you 32 bytes (or some other, larger allocation granularity). So the only information available internally is the 32 bytes, and you as the programmer don't have much use for that figure.
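
On some platforms you can even observe this through non-standard hooks. A sketch assuming glibc, whose malloc_usable_size extension (declared in <malloc.h>, not part of standard C or C++) reports the real size of the block:

#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // malloc_usable_size: glibc-specific extension

int main() {
    void* p = std::malloc(30);
    if (p) {
        // The usable size is often larger than the 30 bytes requested.
        std::printf("requested 30 bytes, usable size is %zu\n", malloc_usable_size(p));
        std::free(p);
    }
}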

Martin
A: 

It's all perfectly in the "keep it simple" philosophy of C: you MUST, at some point, have decided what size the array/buffer/whatever needed to be, so keep that value and that's it. Why waste a function call retrieving information you already have?

Lorenzo Stella
Because keeping something around is a pain too, and if it's stored twice it wastes memory, which is un-C/C++-like as well. You have to admit there are some benefits to retrieving it from wherever the implementation has stored it. The "function call" may be inline-able anyway, depending on the implementation.
Tony