views:

60

answers:

3

hello,

when i use vector to store some data, i usually access to this data by the pointer of the vector's first entry. because it it faster than the at() method. but i realize that when i insert a block of data, say an array to the end of vector, the first entry's pointer changes. this may be realated to stack stuff, but if i add the array one element at a time by push_back, first pointer does not change. so why is that? should i worry about using a pointer to access the elements?

here is a sample code for those who wants to check out:

int arrayLen = 500000;
 vector<int> vint = vector<int>(2000,0);
 int * firstEntry = &vint[0];

 int * iarray = new int[arrayLen];
 for(int i = 0; i< arrayLen; i++)
 {
  iarray[i] = i;
 }
 vint.insert(vint.end(),iarray,iarray+arrayLen);
 cout << firstEntry << "," << &vint[0] << endl; // They ar not equal;

         // reset the vector
 vint.clear();
 vint.resize(2000,0);
 firstEntry = &vint[0];

 for(int i = 0; i< arrayLen; i++)
 {
  vint.push_back(iarray[i]);
  if(firstEntry != &vint[0])
   cout << firstEntry << "," << &vint[0] <<","<< i << endl;
 }// nothing  is written

 cout << firstEntry << "," << &vint[0] << endl; // now they are equal;
+1  A: 

Use reserve() to prevent multiple memory allocations prior to inserting many elements.

Then get your “first pointer” which should not change if you have properly reserved the right number of elements.

But why don't you use begin() ? The * and -> operators work on that, you know…

Benoit
+2  A: 

vectors may reallocate to a larger memory area on the heap when you add elements. You should use [] instead of at() if you want higher performance, but then you must be sure your indices are in range: that's usually easy to ensure. It's very bad practice and totally unnecessary to prefer caching and indexing from a start-of-vector pointer: just turn your optimiser on and make sure you're not using some debug-mode "checked iterator" mode of your STL (if it has one, e.g. Visual C++), and the performance will be identical anyway.

If you know the maximum size to which the vector may grow, then you can pre-size it using reserve().

Tony
@Tony: well there is a very very tiny difference in performance, but when developing an algorithmic trading application, you need all of your guns.
bahadir
+4  A: 

std::vector can dynamically resize. The way it does this is to hold more space than is required. Once you hit the reserved capacity a larger block of data needs to be reserved (depends on implementation, but often the capacity of the new block is double the previous size). The data is then copied across to the new location, which is why the address of the first element changes.

Provided you don't add more data to your vector you don't have to worry about a pointer to data in the vector being invalidated. You can anticipate when a further push_back() will trigger a resize by checking if vint.capacity()==vint.size(). You can also avoid using an invalidated pointer by always using the up-to-date one via &vint[0] (or &vint.at() to get the range checking) rather than copying it.

If you know that you are going to insert a number of new items, you can ensure you will have enough capacity by using reserve() to preallocate the space (if the current capacity is less than what's requested). But be aware that you don't want to do this piecemeal - for instance

vint.reserve(vint.size() + 2000);
for(int i=0; i<2000; ++i) {
    vint.push_back(i);
}

would be fine, but

for(int i=0; i<2000; ++i) {
    vint.reserve(vint.size() + 1);
    vint.push_back(i);
}

would be a performance drag since you're repeatedly asking the OS for more memory and incurring an ever-larger copy operation with each iteration.

beldaz
@beldaz: well i particularly have to add data by insert() method in my applicatin and when i reserve and then insert, first pointer again changes. should i always use push_data in a loop?
bahadir
@bahadir: insert will cause all data after the insertion point to be bumped along. push_back is the nicest way to add data to a vector, but only helps if that's where you want the data. If you're inserting data elsewhere consider deque or list as better-suited alternatives.
beldaz