I am thinking of how I can implement std::vector from the ground up.
How does it resize the vector?
realloc only seems to work for plain old structs, or am I wrong?
It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. If you know roughly how many elements you are going to push_back(), call reserve() first.
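To make the reserve() advice concrete, here is a small sketch that counts how often the capacity changes while appending. The helper name count_reallocations is made up for this example, and it treats a change in capacity() as a stand-in for a reallocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and counts how many times the
// vector's capacity changed, optionally reserving room up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // a single allocation before the loop
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // storage was replaced
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the loop should never see the capacity change. Without it, the count depends on the growth policy of your standard library.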
It is a simple templated class which wraps a native array. It does not use malloc/realloc. Instead it uses the passed allocator (which by default is std::allocator).
Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, implementations usually grow the capacity geometrically rather than by a fixed amount.
In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector; capacity is how many it can hold before it has to reallocate.
So as a starting point a vector will need to look somewhat like this:
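    #include <memory> // std::allocator

    template <class T, class A = std::allocator<T> >
    class vector {
    public:
        // public member functions
    private:
        T* data_;                        // points to first element
        typename A::size_type capacity_; // number of elements the storage can hold
        typename A::size_type size_;     // number of elements currently stored
        A allocator_;
    };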
The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end() (which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size() call (which now needs a subtraction). In which case it could look like this:
    template <class T, class A = std::allocator<T> >
    class vector {
    public:
        // public member functions
    private:
        T* data_;         // points to first element
        T* end_capacity_; // points to one past internal storage
        T* end_;          // points to one past last element
        A allocator_;
    };
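To show where the addition and the subtraction end up, here are the two layouts side by side with just the accessors filled in. The struct names counted_layout and pointer_layout are made up for this comparison, and the allocator is left out to keep it short.

```cpp
#include <cstddef>

// Layout 1: one pointer plus two counters.
template <class T>
struct counted_layout {
    T* data_;
    std::size_t capacity_;
    std::size_t size_;

    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }    // needs an addition
    std::size_t size() const { return size_; }  // plain member read
    std::size_t capacity() const { return capacity_; }
};

// Layout 2: three pointers into the same storage.
template <class T>
struct pointer_layout {
    T* data_;
    T* end_capacity_;
    T* end_;

    T* begin() const { return data_; }
    T* end() const { return end_; }             // plain member read
    std::size_t size() const {                  // needs a subtraction
        return static_cast<std::size_t>(end_ - data_);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_capacity_ - data_);
    }
};
```

Both describe the same state, so which one you pick mostly comes down to which accessor you expect to be called more often.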
I believe GCC's libstdc++ does this. Both approaches are equally valid and conforming.
Of course, you could also use the PIMPL idiom to make swap much simpler at the cost of an extra indirection during access. But that's a matter of preference.
realloc only works on heap memory. In C++ you usually want to use the free store.
From Wikipedia, as good an answer as any.
A typical vector implementation consists, internally, of a pointer to a dynamically allocated array,[2] and possibly data members holding the capacity and size of the vector. The size of the vector refers to the actual number of elements, while the capacity refers to the size of the internal array. When new elements are inserted, if the new size of the vector becomes larger than its capacity, reallocation occurs.[2][4] This typically causes the vector to allocate a new region of storage, move the previously held elements to the new region of storage, and free the old region. Because the addresses of the elements change during this process, any references or iterators to elements in the vector become invalidated.[5] Using an invalidated reference causes undefined behaviour.
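The invalidation described in that quote can be observed without touching a dangling pointer. This sketch records the storage address as an integer before forcing a reallocation and compares it afterwards. The function name push_back_relocates is invented for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: fills the vector up to its capacity, appends one
// more element, and reports whether the storage address changed.
bool push_back_relocates(std::vector<int>& v) {
    v.resize(v.capacity());  // size == capacity, so the next push_back must grow
    // Store the address as an integer so the old pointer is never used again.
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());
    v.push_back(0);
    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}
```

Any pointer, reference, or iterator obtained before the push_back would refer to the old block, which is why using it afterwards is undefined behaviour.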
The reimplementation of vector as an exercise is covered in detail in Accelerated C++, a book you should probably read in any case.
Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).
Note that it does not use new [] either. It uses the allocator that's passed, but that's required to allocate raw memory, not an array of objects like new [] does. You then need to use placement new to construct objects in place. [Edit: well, you could technically use new char[size], and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]
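Here is a minimal sketch of the raw memory plus placement new pattern. It calls ::operator new directly rather than going through an allocator object, which is a simplification on my part. Tracked, Stages and demo_raw_storage are invented names, and the function assumes count is no larger than capacity.

```cpp
#include <cstddef>
#include <new>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    Tracked(const Tracked& other) : value(other.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Live-object counts observed at each stage of the demo.
struct Stages {
    int after_allocate;
    int after_construct;
    int after_destroy;
};

// Allocates raw storage for `capacity` objects, constructs `count` of them
// in place, destroys them again, and frees the storage.
Stages demo_raw_storage(std::size_t capacity, std::size_t count) {
    Stages s{};
    // Raw, uninitialized memory: no Tracked constructor runs here.
    void* raw = ::operator new(capacity * sizeof(Tracked));
    Tracked* data = static_cast<Tracked*>(raw);
    s.after_allocate = Tracked::live;

    for (std::size_t i = 0; i < count; ++i) {
        // Placement new: construct an object at an address we already own.
        ::new (static_cast<void*>(data + i)) Tracked(static_cast<int>(i));
    }
    s.after_construct = Tracked::live;

    for (std::size_t i = count; i > 0; --i) {
        data[i - 1].~Tracked();  // explicit destructor call, memory stays allocated
    }
    s.after_destroy = Tracked::live;

    ::operator delete(raw);  // release the raw memory
    return s;
}
```

The point is that allocation and construction are separate steps, which is what lets a vector hold capacity for elements that do not exist yet.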
When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factor compared to the old size to meet the requirement for amortized constant complexity for push_back. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.
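To see why a constant factor matters, this sketch simulates push_back and adds up how many element copies the reallocations cost under different growth policies. The policy functions and total_copies are made up for the illustration, and the simulation only tracks counts, not real memory.

```cpp
#include <cstddef>

// Simulates n push_back calls and returns the total number of element
// copies caused by reallocations. `next_capacity` is the growth policy.
template <class Policy>
std::size_t total_copies(std::size_t n, Policy next_capacity) {
    std::size_t capacity = 0;
    std::size_t size = 0;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            capacity = next_capacity(capacity);
            copies += size;  // every existing element moves to the new block
        }
        ++size;
    }
    return copies;
}

// Hypothetical growth policies.
inline std::size_t grow_by_half(std::size_t c) { return c + c / 2 + 1; }     // roughly 1.5x
inline std::size_t grow_double(std::size_t c) { return c == 0 ? 1 : 2 * c; } // 2x
inline std::size_t grow_by_ten(std::size_t c) { return c + 10; }             // fixed step
```

With either geometric policy the total stays within a small constant multiple of n, so each push_back is cheap on average. With the fixed step the total grows roughly with the square of n.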
You'd need to define what you mean by "plain old structs."
realloc by itself only creates a block of uninitialized memory. It does no object construction. For C structs, this suffices, but for C++ it does not.
That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector exactly in this case!), you'd need to:
- Use malloc/realloc/free consistently throughout your class.
- Use "placement new" to initialize objects in your memory chunk.
This is actually pretty close to what vector does in my implementation (GCC/libstdc++), except it uses the C++ low-level routines ::operator new and ::operator delete to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.
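For what a realloc-based container might look like, here is a sketch that I have deliberately restricted to trivially copyable types. That restriction is my addition: realloc may move the block by copying bytes, which is only valid for such types. The class name pod_buffer is invented, and over-aligned types are not handled.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical growable buffer built on malloc/realloc/free.
template <class T>
class pod_buffer {
    static_assert(std::is_trivially_copyable<T>::value,
                  "realloc moves bytes, so T must be trivially copyable");

public:
    pod_buffer() = default;
    pod_buffer(const pod_buffer&) = delete;
    pod_buffer& operator=(const pod_buffer&) = delete;
    ~pod_buffer() { std::free(data_); }  // trivial destructors, nothing else to run

    void push_back(const T& value) {
        const T copy = value;  // `value` may refer into the block about to move
        if (size_ == capacity_) {
            const std::size_t new_capacity = capacity_ + capacity_ / 2 + 1;
            void* p = std::realloc(data_, new_capacity * sizeof(T));
            if (!p) {
                throw std::bad_alloc();  // the old block is still valid here
            }
            data_ = static_cast<T*>(p);
            capacity_ = new_capacity;
        }
        ::new (static_cast<void*>(data_ + size_)) T(copy);  // placement new
        ++size_;
    }

    T& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    T* data_ = nullptr;
    std::size_t capacity_ = 0;
    std::size_t size_ = 0;
};
```

Something like pod_buffer<int> works, while pod_buffer<std::string> is rejected at compile time by the static_assert.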
Since vector is a template, you actually should have its source to look at if you want a reference. If you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector or thereabouts.