views:

558

answers:

8

I was asked this question in an interview.

The points I answered are like this

1) an index pointing to the current position;

2) resize if neccessary.

Can anybody elaborate more?

A: 

Thanks to some comments, I am completely revising a very incorrect original answer.

According to the STL spec, your answer was correct. The vector is implemented as a dynamically resized array:

Vector containers are implemented as dynamic arrays; Just as regular arrays, vector containers have their elements stored in contiguous storage locations, which means that their elements can be accessed not only using iterators but also using offsets on regular pointers to elements.

But unlike regular arrays, storage in vectors is handled automatically, allowing it to be expanded and contracted as needed.

Heath Hunnicutt
No. Absolutely not. A vector is a dynamic array -- the standard requires that the underlying storage be the exact same as a builtin array. The standard requires only O(1) *amortized* time. Therefore, it need only be O(1) in the average case.
Billy ONeal
The underlying buffer of a vector must be contiguous, meaning it cannot be a linked list. The push_back() method can still run in constant time (on average) if resizing doesn't occur during each call. This typically requires that the buffer increase in size by some factor (e.g. double in size whenever the vector is full).
Void
+2  A: 

Possibly what they were looking for is that push_back makes a copy of the object being pushed onto the vector (using its copy constructor).

With regard to resizing: The standard says a.push_back(x) is equivalent to a.insert(a.end(),x). The definition of insert says, in part: “Causes reallocation if the new size is greater than the old capacity.”

The standard says what the functions are supposed to do. But how they’re implemented is, in most cases, implementation-specific.

Nate
A: 

vector doesn't use a linked list. It uses continuous memory.

If there is not enough reserved space push_back allocates a new chunk of memory twice as large as the original vector. By doing that the amortized runtime is O(1).

Axel Gneiting
...a new chunk that's larger than the old by some constant factor anyway -- but the factor is typically around 1.5 rather than 2.
Jerry Coffin
@Jerry, strange. For a factor of 1.5 two copies for each elements on average are needed, whereas for the factor of 2 it's just 1 copy.
Pavel Shved
@Pavel: Andrew Koenig wrote an article years ago about one reason to favor a smaller factor. With a factor of two, the sum of chunks you've discarded will always be smaller than your next larger allocation, so you never get to reuse them for the same container. As long as the factor is <=the golden mean, they'll eventually add up to a large enough chunk to re-use for the next larger allocation.
Jerry Coffin
@Coffin, this is the case only if the underlying memory manager allocates the chunks thriftily. However, the manager may stick to 2^n chunks which renders the 1.5-scheme useless.
Pavel Shved
+10  A: 

An STL vector has a size (current number of stored elements) and a capacity (currently allocated storage space).

  • If size < capacity, a push_back simply puts the new element at the end and increments the size by 1.
  • If size == capacity before the push_back, a new, larger array is allocated (twice the size is common, but this is implementation-dependent afaik), all of the current data is copied over (including the new element), and the old allocated space is freed. This may throw an exception if the allocation fails.

The complexity of the operation is amortized O(1), which means during a push_back that causes a resize, it won't be a constant-time operation (but in general over many operations, it is).

tzaman
I want also to comment that the amortized cost O(1) is the case if the new capacity is `k` times bigger than the old one. It's also of note that, for a given `k`, each element is copied `1/(k-1)` times on average (for `k=2` it's just one additional copy).
Pavel Shved
+1 for the amortization insight.
andand
Also, `2` is not the most common factor. Most implementations have shifted to the golden ratio because it plays nicer with memory... this had been discussed here on SO and a comp.lang.c++.moderated thread was referenced.
Matthieu M.
Ah, didn't know that. Thanks for the info; can you provide links to the discussions you mention?
tzaman
+1  A: 

That, of course, is inherently implementation defined. Assuming it's a question of how somebody would implement a dynamic array, in general, it'd be something like this:

  1. push_back checks capacity() and ensures it's at least one larger than size().
  2. If there is no capacity for the new element, the vector reallocates it's entire underlying storage, copying over the contents of the old storage to the new storage. The old storage is deallocated.
  3. The new element is copied to the end of the dynamic array.

Some STL implementations will elide some of the copies by using swaps (i.e. for containers of containers), but for the most part that's exactly how it works.

Billy ONeal
A: 

How much detail did the interviewer want? For example, was he looking for you to drill down into the lower level details?

Besides the usual resize-as-needed to retain the O(1) on average semantics, some things to consider include but are not limited to:

  • Exception safety: Does the implementation provide a guarantee that state of the vector will not be modified if an exception is thrown when appending the new element (e.g. rollback semantics)? For example, an exception could be thrown during allocation, copying, insertion, etc.
  • Proper use of allocator: Does the implementation correctly use the vector's allocator instead of plain old new based allocator (both may or may not be the same)? Ideally, this will be handled transparently by the vector's resizing code, implementations could certainly differ, however.
Void
+5  A: 
template< typename T >
void std::vector<T>::push_back(const T& obj)
{
    this->insert(this->end(),obj);
}
sbi
You mean `void std::vector<T>::push_back` ?
Seth Johnson
@Seth: Thanks. I knew I would spoil it...
sbi
Perfect response to a bad question. Technically correct without answering what they *meant* to ask.
Graphics Noob
@Graphics: Maybe I'm dense, but I'm not sure what else was expected. I think asked "How is push_back implemented in `std::vector`?" I would have given this answer in the interview.
sbi
@sbi I would assume they're looking for an answer involving memory management (like tzaman's answer).
Graphics Noob
@Graphics: I had interviews like that. They asked a question and I pointed out the error in it. Didn't wok out.
sbi
A: 
Noah Roberts
This seems like a fairly expensive implementation. You first copy the contents of the vector in reverse, which requires allocation + O(n) iteration to copy of each element in the vector. Then you push the parameter to the *beginning* of the temporary vector, which requires yet another allocation and O(n) accompanying element copy operations. Then you create another copy of the temporary (allocation + O(n) reverse iteration/element copies) that is copy assigned to the vector being operated on. That assignment will result in deallocation + O(n) element destruction operations. Kinda slow. :)
Void