views:

166

answers:

4

Hi there,

I have the following toy code, intended to remove duplicates from a vector:

void overlap_removal(vector<int> &vec1, vector<int> &vec2) {
  for (vector<int>::iterator it1 = vec1.begin(); it1 != vec1.end(); ++it1) {
    for (vector<int>::iterator it2 = vec2.begin(); it2 != vec2.end(); ++it2) {
      if ((*it1)*(*it2) < 10) {
        vec1.erase();
      }
    }
  }
}

I'm doing a slightly more complicated comparison in the real code, but didn't want to confuse matters. The problem is the segmentation fault that inevitably follows the execution of this: I think this is due to the fact that I'm deleting an element and then continuing to loop over the same vector.

How can I make the code work? Is this even the right starting point? Thanks in advance

+8  A: 

Try remove_if.

The basic idea is you provide a function object such that true is returned if the passed in element should be deleted:

  class ItemInOtherVectorPred
  {
      const std::vector<int>& otherVec;

      ItemInOtherVectorPred(const std::vector<int>& vec) : otherVec(vec) {}

      // return true if removeVecsElem should be deleted
      bool operator()(const int& removeVecsElem) const
      {
          return (otherVec.find(removeVecsElem) != otherVec.end())
      }
  }

Then you use an instance of that object to tell remove_if what to remove from your vector.

  void overlap_removal(vector<int> &vec1, vector<int> &vec2) 
  {
     ItemInOtherVectorPred trueIfItemInOtherVecPred( vec2);
     vector<int>::iterator eraseBeg = 
             std::remove_if( vec1.begin(), vec1.end(), trueIfItemInOtherVecPred);
     vec1.erase(eraseBeg, vec1.end());

  }
Doug T.
Don't forget to `erase` from the return value of `remove_if` to the `end` of the vector to actually remove those values rather than leaving them at the end.
Mark B
@sbi its not the index, its the value type.
Doug T.
@Doug: Sorry. I guess I should go to bed.
sbi
+1  A: 

That's true. Once you delete the element your iterator is invalid. You have to create a new iterator each time you delete the element.

Manoj R
+3  A: 

If I wanted to preserve your logic as far as possible, I would do it like this.

it1 gets updated at the end of the outer loop, depending on whether a match was found in the inner loop.

Use references to pass in the params to avoid copying the inputs and ensure first input reflects changes.

second vector is const.

void overlap_removal(vector<int>& vec1, const vector<int>& vec2) {
  for (vector<int>::iterator it1 = vec1.begin(); it1 != vec1.end(); ) {
    bool match(false);
    for (vector<int>::const_iterator it2 = vec2.begin(); it2 != vec2.end(); ++it2) {
      if (*it1 == *it2) {
        match = true;
        break;
      }
    }
    if (match)
    {
      it1 = vec1.erase(it1);
    }
    else
    {
       ++it1;
    }
  }
}

There are better ways to do this using STL features, but others are posting on that I see. Still, it's good to understand how the vector methods work, even if you can bypass them in this instance.

Steve Townsend
@Doug - sure, I noted that there are better ways to do this. Understanding semantics of `vector::erase` does not hurt OP though.
Steve Townsend
Thanks. If there's a slicker way to do it I'd also be interested.
invisiblerhino
@Steve, see that now, thanks. Comment deleted.
Doug T.
@Doug - np, that was an edit on my part after guilt pangs re hand-written `vector` loop.
Steve Townsend
@invisiblerhino - for slicker way, see (, upvote, accept) @Doug's post. I would make second `vector` param on `overlap_removal` `const`.
Steve Townsend
A: 

If vec2 is large or has many repeated elements (which are uselessly scanned over and over in this function), it may be more efficient to sort the second vector and place it in a (unordered) set, to reduce the complexity.

#include <vector>
#include <unordered_set>
#include <iostream>
#include <iterator>
#include <algorithm>
void overlap_removal(std::vector<int> &v1, const std::vector<int> &v2) 
{
    std::unordered_set<int> s(v2.begin(), v2.end());
    v1.erase(std::remove_if(v1.begin(), v1.end(),
                            [&s](int i){return s.count(i);}),
             v1.end());
}
int main()
{
    std::vector<int> v1 = {5,6,3,2,3,5,1,2,1};
    std::vector<int> v2 = {2,3};
    overlap_removal(v1, v2);
    copy(v1.begin(), v1.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}

Or, keeping it C++98

struct IsInSet {
    const std::set<int>& m_s;
    IsInSet(const std::set<int>& s) : m_s(s) {} 
    bool operator()(int i) const { return m_s.count(i); }
};
void overlap_removal(std::vector<int> &v1, const std::vector<int> &v2) 
{
    std::set<int> s(v2.begin(), v2.end());
    v1.erase( std::remove_if(v1.begin(), v1.end(), IsInSet(s)), v1.end());
}
Cubbi