ansaurus

Question

Weird behaviour with vector::erase and std::remove_if with end range different from vector.end()

Answer 1

+1 A:

~~Erasing elements in std::vector invalidates iterators past the removed element, so you cannot use "foreign" functions that accept ranges. You need to do that in a different way.~~

EDIT:

In general, you can use the fact that erasing one element "shifts" all elements at further positions one back. Something like this:

for (size_t scan = 2, end = 4; scan != end; )
  {
     if (/* some predicate on ints[scan] */)
       {
         ints.erase (ints.begin () + scan);
         --end;
       }
     else
       ++scan;
  }

Note that std::vector isn't suited for erasing elements in the middle. You should consider something else (std::list?) if you do that often.

EDIT 2:

As clarified by comments, first paragraph is not true. In such case std::remove_if should be more efficient than what I suggested in the first edit, so disregard this answer. (Keeping it for the comments.)

doublep 2010-04-26 21:28:27

Er, no, it's perfectly safe to use `std::remove_if` on a `std::vector`. The STL is smart enough to avoid using invalidated iterators when doing that.

Tyler McHenry 2010-04-26 21:42:17

@Tyler: Are you sure? I don't see any relevant specializations at least in GNU STL and standard function never modifies its `last` argument.

doublep 2010-04-26 21:53:01

@Tyler, yes it is perfectly safe, but it has nothing to do with the STL being smart. `std::remove_if` simply copies elements toward the front in order to achieve the effect of removing the elements they replace; it does nothing at all to the vector itself or any iterators derived from it. In fact, the tail of the vector is left untouched and contains elements that don't belong in the reduced list. To remove these, you have to truncate the vector using the iterator returned from `std::remove_if`.

Marcelo Cantos 2010-04-26 21:55:06

Well, it's "smart" in the sense that it's implemented in the way you describe, rather than being implemented in a naiive way that would cause it to invalidate iterators.

Tyler McHenry 2010-04-26 21:58:33

Answer 2

+8 A:

Edit: Sorry, the original version of this was incorrect. Fixed.

Here's what's going on. Your input to remove_if is:

1  2  3  4  5  6
      ^     ^
    begin  end

And the remove_if algorithm looks at all numbers between begin and end (including begin, but excluding end), and removes all elements between that match your predicate. So after remove_if runs, your vector looks like this

1  2  3  ?  5  6
      ^  ^
  begin  new_end

Where ? is a value that I don't think is deterministic, although if it's guaranteed to be anything it would be 4. And new_end, which points to the new end of the input sequence you gave it, with the matching elements now removed, is what is returned by std::remove_if. Note that std::remove_if doesn't touch anything beyond the subsequence that you gave it. This might make more sense with a more extended example.

Say that this is your input:

1  2  3  4  5  6  7  8  9  10
      ^              ^
    begin           end

After std::remove_if, you get:

1  2  3  5  7  ?  ?  8  9  10
      ^        ^
    begin      new_end

Think about this for a moment. What it has done is remove the 4 and the 6 from the subsequence, and then shift everything within the subsequence down to fill in the removed elements, and then moved the end iterator to the new end of the same subsequence. The goal is to satisfy the requirement that the (begin, new_end] sequence that it produces is the same as the (begin, end] subsequence that you passed in, but with certain elements removed. Anything at or beyond the end that you passed in is left untouched.

What you want to get rid of, then, is everything between the end iterator that was returned, and the original end iterator that you gave it. These are the ? "garbage" values. So your erase call should actually be:

ints.erase(it, ints.begin()+4);

The call to erase that you have just erases everything beyond the end of the subsequence that you performed the removal on, which isn't what you want here.

What makes this complicated is that the remove_if algorithm doesn't actually call erase() on the vector, or change the size of the vector at any point. It just shifts elements around and leaves some "garbage" elements after the end of the subsequence that you asked it to process. This seems silly, but the whole reason that the STL does it this way is to avoid the problem with invalidated iterators that doublep brought up (and to be able to run on things that aren't STL containers, like raw arrays).

Tyler McHenry 2010-04-26 21:39:48

Nice graphics :)Well, I didn't think it was necessary to omit the erase() call, but the debugger is giving me some weird behaviour after the remove_if(), so I was wondering if I was doing something wrong or this kind of usage is not correct (using lists then, like doublep said)

Edison Gustavo Muenz 2010-04-26 21:44:21

Nice, your extended version really clears things for me. I'll be testing your suggestions and giving feedback soon

Edison Gustavo Muenz 2010-04-26 22:12:57

Answer 3

+1 A:

The behavior isn't weird - you're erasing the wrong range. std::remove_if moves elements it "removes" to the end of the input range. In this case, what you're looking for would be to do:

ints.erase(it, ints.begin() + 4 /* your end of range */);

From C++ in a Nutshell:

The remove_if function template "removes" items for which pred returns false from the range [first, last). The return value is one past the new end of the range. The relative order of items that are not removed is stable.

Nothing is actually erased from the underlying container; instead, items to the right are assigned to new positions so they overwrite the elements for which pred returns false. See Figure 13-13 (under remove_copy) for an example of the removal process.

Nathan Ernst 2010-04-26 21:56:08

ansaurus

tags:

views:

answers:

Weird behaviour with vector::erase and std::remove_if with end range different from vector.end()

related questions