views:

433

answers:

4

I have some boolean expressions to evaluate and process. Maybe this would have all been better with Boost, but I'm still learning STL and didn't go that way. I'm now learning about iterator validation, or INvalidation as the case may be. Is there a way to insert a new element into this nested vector below safely? If you don't want to see a grown man cry, don't suggest I rewrite everything :) In all seriousness, I'd also welcome suggestions for how to rewrite this in a more elegant way after I fix my more immediate problem, which I suspect is an invalidated iterator...

... I'm not terribly concerned about performance. Based on that and reading through other posts, perhaps a std::List in place of std::vector would be better, but would I need that at every level of the nesting?

----nested.h

#include <vector>

struct Term {
    uint32_t termNumber;
    std::string content;
    uint32_t importance;
    uint32_t numAppearances;
    uint32_t ContextFlags;
};

struct SubClause {
    std::string typeName;
    std::vector<Term> terms;
    std::string clauseExpression;
};

struct Clause {
    std::vector<SubClause> subClauses;
};

-----nested.cpp

#include <iostream>
#include "nested_container.h"

int main (int argc, char * const argv[]) {

std::vector< Clause > expression;

std::vector< Clause >::iterator clauseIter = expression.begin();
std::vector< Clause >::iterator clauseEnd = expression.end();
for( ; clauseIter != clauseEnd ; clauseIter++ )
{
    std::vector< SubClause >::iterator subIter = clauseIter->subClauses.begin();
    std::vector< SubClause >::iterator subEnd = clauseIter->subClauses.end();
    for( ; subIter != subEnd ; subIter++ )
    {
        std::vector< Term >::iterator termIter = subIter->terms.begin();
        std::vector< Term >::iterator termEnd = subIter->terms.end();

        for( ; termIter != termEnd ; termIter++ )
        {

            /* Evaluate SubClause Terms based on some criteria
             */
            /* if criteria true  */
            if( true/* criteria true? */ )
            {
                Term newTerm = { };
                /* fillOutTerm(newTerm) */
                /* Invalidates the subIter pointer, pretty sure.  Anything else???? */
                subIter->terms.push_back( newTerm ); //BAD?
             }
        }

    }
}
return 0;

}

A: 

There are three things that invalidate vector iterators:

  • destroy the vector :P
  • push_back causing a reallocation (invalidates all iterators)
    • invalidates end() even without a reallocation
  • insert or erase
    • only invalidates iterators after the affected items, except an insert that causes a reallocation, which invalidates every iterator

Other operations which are defined in terms of these, such as assign (erase+insert) or clear (erase), behave similarly.

Vector reallocates when size() needs to exceed capacity(), you can call reserve() to ensure capacity() has a certain value.

Because you append in your inner loop, it needs to look more like:

std::vector<Term>::iterator termIter = subIter->terms.begin();
for (; termIter != terms.end(); ++termIter) {
  if (true/* criteria true? */) {
    Term newTerm = { };
    std::vector<Term>::size_type cur_pos = termIter - subIter->terms.begin();
    subIter->terms.push_back(newTerm);
    termIter = subIter->terms.begin() + cur_pos;
  }
}

This uses a property of random access iterators to save the position and restore it, for termIter, regardless of whether vector had to reallocate for push_back. And since the end iterator is always gotten from the vector, it is always valid.

Even though not everything at the site applies to the stdlib, SGI has a good reference on the STL (it's not the same thing as the stdlib, or the templates in the stdlib) and the concepts it uses.

Roger Pate
Just FWIW, a push_back that causes reallocation invalidates *all* iterators into the vector.
Jerry Coffin
You may be referring to another container. Anything that could cause the std::vector to grow has the potential of invalidating all iterators.
Martin York
Jerry: yes, I meant push_back invalidates end() even without a reallocation; fixed.
Roger Pate
A: 

One option is to create a copy of the inner vector, perform any updates on that, and then swap the copy in.

At the end of the day, it's just about structuring your code so that you don't use iterators after the container (vector in this case) has been modified.

+1  A: 

When you push_back into a vector, you (potentially) invalidate iterators into that vector, but if that vector happens to be one of the items in another vector, iterators into that other vector are unaffected.

On the other hand, in this case, subscripts seem (at least to me) to produce code that's considerably shorter, simpler, and cleaner than iterators:

for (i=0; i<clause.size(); i++) {
    std::vector<term> &terms = clause[i].terms;
    for (j=0; j<terms.size(); j++)
        if (criteria)
            terms.push_back(term(x, y, z));
}

Under the circumstances, using iterators strikes me as a net loss. With sufficient care they'd let you store the data in a std::list (for example), but that doesn't seem (in this case) to compensate for the extra length and difficulty in reading. Iterators are very useful for general algorithms that can reasonably apply to a wide variety of containers, but otherwise they often accomplish little or nothing.

As a minor aside, vector::size() can (at least theoretically) have linear complexity, so for real code you might want to lift the calls to size() out of the respective loops.

Jerry Coffin
A: 

I think you are taking the invalidation too far:

/* fillOutTerm(newTerm) */
/* Invalidates the subIter pointer, pretty sure.  Anything else???? */
subIter->terms.push_back( newTerm ); //BAD?

In this push_back() only iterators that are related too subIter->terms are addected. (ie subIter) is not affected as you are not changing the vector that subIter is a member of. Looking at your code this will potentially invalidate 'termIter' and 'termEnd'.

Martin York