ansaurus

Question

Answer 1

+14 A:

It's implemented by using an underlying array.

It's not possible to implement a vector<T> with a linked list because the standard guarantees the elements in the list will be held in contiguous memory.

JaredPar 2010-01-19 19:52:40

Also, lookup time is important.

Ed Swangren 2010-01-19 20:53:36

Contiguous memory is only required by the upcoming C++ standard. But all available implementation today use an array anyway.This is an indirect requirement by the O(1) access time.

Axel Gneiting 2010-01-19 21:16:39

From the current (2003) standard: "The elements of a vector are stored contiguously" (23.2.4/1).

James McNellis 2010-01-19 22:19:43

@Axel: No, it is already required by C++03. It was not required by C++98, but it was always intended to be contiguous, so C++03 made it explicit.

jalf 2010-01-19 23:24:02

Okay seems that I mixed that one up. I'm sorry.

Axel Gneiting 2010-01-20 00:29:49

@Downvoter, care to explain?

JaredPar 2010-02-02 19:38:51

Answer 2

+2 A:

There is no one way it is implemented. Different implementations can be different, so long as the preserve the semantics and satisfy the requirements.

At any given time, there has to be a primitive array of T to satisfy the requirements of contiguity. However, how it is allocated, grown, shrunk, and freed is up to the implementor.

You can read the implementation for yourself, it's right there in the header file.

I can tell you that no implementations use linked lists. They aren't consistent with the requirements of the standard.

bmargulies 2010-01-19 19:53:14

I will assert unequivocally there is no implementation that is a linked list.

Jason 2010-01-19 19:58:41

Part of the required semantics are O(1) random access. Linked list isn't a possibility.

jamesdlin 2010-01-19 20:05:07

Potatoswatter 2010-01-19 21:18:00

There is a lot of room for difference in how the array gets allocated, freed, and all that other stuff the OP asked about.

bmargulies 2010-01-19 21:18:59

@bmargulies: There is a requirement for amortized constant time on push_back() operations, which requires exponential growth of the underlying array. The only wiggle room is what that growth factor is. I think 2 is most common, but 1.5 has been shown to be optimal IIRC.

Drew Hall 2010-01-19 21:26:18

It gets allocated and freed by the vector's allocator, where the default allocator simply calls `operator new()`, `operator delete()`, and the class' ctor/dtor. Name one potential difference between two implementations besides the resize-step formula.

Potatoswatter 2010-01-19 21:37:51

@Potatoswatter: It's a matter of emphasis. I chose to emphasize the fact that there is abstraction going on here, however minimal, and that the only way to have certain knowledge is to read the implementation. Obviously, my choice was not especially popular. That's why we got votes.

bmargulies 2010-01-19 23:00:59

Answer 3

+2 A:

I believe the STL uses option #2 (or something similar) because a std::vector<> is guaranteed to store the elements in contiguous memory.

If you're looking for a memory structure that doesn't need to use contiguous memory, look at std::deque.

ceretullis 2010-01-19 19:53:44

Yes I know that, but I was wondering how vectors are implemented..... thanks

Arun 2010-01-19 19:55:27

Answer 4

+7 A:

They use a dynamically allocated array that is regrown as needed. It is necessary to use something like an array so that the elements are contiguous in memory which is guaranteed by the standard.

Incidentally, one common way of regrowing an array is to double the size as needed. This is so that if you are inserting n items at most only O(log n) regrowths are performed and at most O(n) space is wasted.

You can read one implementation for yourself at SGI (where STL was originally conceived).

Jason 2010-01-19 19:56:14

Answer 5

A:

Ok I now have an Idea,, For example what I can do is do sth like Data * temp = new Data(max);

if due to push_back if the total length increase by max,, I can create a new element,, for e.g.

max = 2*max;

Data *temp1 = new Data(max);

and copy the content of the previous vector.

This poses a new questions,, if there is more data then more resize and more inefficient.

Arun

Arun 2010-01-19 19:59:32

you can reserve vector storage if you are dealing with large datasets, this will eliminate growth overhead. also, you probably want to use realloc if you want to implement yourself.

aaa 2010-01-19 20:10:02

This should be a seperate question or a comment within your original question, not a response to your own question.

John Dibling 2010-01-19 20:39:45

Answer 6

+3 A:

I believe it is the third option. It can't just use new T[n] because then it would actually have to construct as many objects as it allocates. E.g

std::vector<Foo> v;
v.reserve(10);

If your implementation simply ended up doing new Foo[10] then you'd just have constructed 10 instances of Foo.

Instead it uses its allocator to allocate and deallocate raw memory (without constructing objects), and as needed (for example, when you actually push_back objects) places copy-constructed instances into correct memory locations in its reserve using placement new and removes them with explicit calls to the destructor (something you'd only do in combination with placement new). The allocator class provides following methods for that which I presume vector's implementations use

 void construct(pointer p, const_reference val);

  Returns:
    new((void *)p) T(val)

  void destroy(pointer p);

  Returns:
    ((T*)p)->~T()

(The "returns" probably should read "effect" or similar.)

More about placement new

UncleBens 2010-01-19 20:52:46

Answer 7

+2 A:

Section 23.2.4, ¶1 of the standard requires that arithmetic on pointers into a vector work the same as with pointers into an array.

The elements of a vector are stored contiguously, meaning that if v is a vector where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().

This guarantees that the storage is in an array. Of course, if you resize the array to be bigger, it might get moved in memory.

Potatoswatter 2010-01-19 21:21:29

Answer 8

+1 A:

There's no actual array at all in any decent implementation (if there is, you can't use any object in it without a default constructor), but just raw memory that gets allocated. It gets allocated in a manner that's usually along the lines of doubling every time you need to expand it.

The vector then uses in place allocation to call the constructors of the class in the proper location once each slot actually gets used actually used.

When there is expansion it will try to reallocate in place (but this is a bit silly and doesn't normally work, think windows 98 heap compaction) but usually will end up making a whole new allocation and copying over.

A standard stl vector is always all together, but not all implementations work like that (I know, having written some of them). Probably none are exactly a linked list, though, either.

Charles Eli Cheese 2010-01-19 22:15:42

More or less, but where are you taking the in-place reallocation from?

UncleBens 2010-01-20 17:58:22

Answer 9

A:

From what i have read in books and from the functionality of reserve and and the requirement that elements of vectors be contiguous, This is what i think could be a possible way to implement Vector.

1) Elements of vectors be contiguous , supporting O(1) random access and vectors should be compatible with C arrays. This just implies there are no linked lists.

2) When you call reserve it reserves additional memory. But reserve does call

new T[newSize]

to reserve more memory. Otherwise it will call default constructor. As uncleben explained whenever reserve is called the vector class just allocates more uninitialized memory usin its allocator (if required) and copy construct new objects into that memory using placement new(if more memory has been allocated)

3) Initially vector has some default capacity. for which uninitialized memory is allocated when the vector object is constructed

4) push_back copy constructs the object into the first available location. If required more memory has to be allocated in similar manner as reserve

Yogesh Arora 2010-01-20 00:56:21

Answer 10

+1 A:

A pedagogic (and thus simplified) version of a container called "Vec" is discussed in Chapter 11 of the wonderful (introductory) book "Accelerated C++". What they describe is a stripped-down version of std::vector, but I think it is still worth noting that:

1) they implement their template class in terms of an array,

2) they discuss push_back in terms of the trick (mentioned above) of allocating more storage than is needed, and coming back for more when they run out, and

3) they use allocator<T> for memory management. The new operator is not flexible enough in this context, since it both allocates and initializes memory.

I repeat, though, that this doesn't mean that actual implementations out there are this simple. But since "Accelerated C++" is quite widespread, those interested can find in the relevant chapter one way vector-like objects can be created, copied, assigned, and destroyed.

EDIT: On a related note, I just found the following blog post by Herb Sutter in which he comments on an earlier blog post by Andrew Koenig, regarding whether or not one should be worried about vector elements being contiguous in memory: Cringe not: Vectors are guaranteed to be contiguous.

Alexandros Gezerlis 2010-01-20 06:29:15

ansaurus

tags:

views:

answers:

How is C++ stl vector implemented

related questions