ansaurus

Question

How do I get characters common to two vectors in C++?

Answer 1

A:

Maybe you should be using std::strings instead of vectors, if you have characters in them? Strings have plenty of functionality for searching etc.

Tronic 2010-03-08 19:33:44

Answer 2

+7 A:

I think you're looking for std::set_intersection. The source vectors have to be sorted though. If you don't care about the order of your output vector, you could always run it on sorted copies of your source vectors.

And BTW, the manual naive way isn't horribly complex. Given two source vectors s1 and s2, and a destination vector dest, you could write something that looks like this:

for (std::vector<char>::iterator i = s1.begin(); i != s1.end(); ++i)
{
    if (std::find(s2.begin(), s2.end(), *i) != s2.end())
    {
        dest.push_back(*i);
    }
}

You have a lot of options for the find step depending on your choice of data structure.

Kristo 2010-03-08 19:34:00

Thank you. I was expecting it to be something like this.

Sam Phelps 2010-03-08 19:36:36

`set_intersection` only works if both vectors are sorted.

Jon-Eric 2010-03-08 19:37:02

@Jon-Eric: I believe Kristo already said that....

Billy ONeal 2010-03-08 19:38:35

@BillyONeal Kristo's answer didn't include that part when I left the comment.

Jon-Eric 2010-03-08 19:40:28

@Jon-Eric: Actually, it did. The first revision has that information. Check the edit history if you don't believe me.

Billy ONeal 2010-03-08 19:45:55

Ok guys, no need to argue. I built up my answer with several edits in quick succession. Apparently, not all of them were saved as separate revisions by the SO engine.

Kristo 2010-03-08 19:47:44

It is important to note that `binary_search` requires the range to be ordered, so that manual implementation will only work if `s2` is in fact ordered. If none of the ranges is ordered you can use `std::find` instead of `std::binary_search` which will be slower (`O( n^2 )` for the whole algorithm) but does not require ordering.

David Rodríguez - dribeas 2010-03-08 19:59:00

@David, good catch. I think I was updating my answer as you wrote your comment. :-)

Kristo 2010-03-08 20:02:09

Answer 3

+1 A:

int temp[5000]; // declare this globally if you're going to be 
                // doing a lot of set_intersection calls   

int main() {

  char x[]={'a','b','c','d','e'};
  char y[]={'b','c','g'};
  vector<char> v1(x,x+sizeof x/sizeof x[0]);
  vector<char> v2(y,y+sizeof y/sizeof y[0]);
  sort(v1.begin(),v1.end());
  sort(v2.begin(),v2.end());  // the vectors *must* be sorted!!!!!!

  vector<char> inter=vector<char>(temp,set_intersection(v1.begin(),v1.end(),v2.begin(),v2.end(),temp)); // inter contains {'b','c'}
  int cnt=set_intersection(v1.begin(),v1.end(),v2.begin(),v2.end(),temp) - temp;    // cnt=2

  for(int i = 0; i < (int)inter.size(); ++i) {
    cout<<inter[i]<<" ";
  }
  cout<<endl;

  return 0;
}

dcp 2010-03-08 19:40:38

Let me check I understand this, as I think this has helped me understand the stuff about set_intersection I have found since posting the question.inter contains b and c, which are the characters common to x and y right?

Sam Phelps 2010-03-08 19:43:30

@Sam Phelps - Yes, that is right. And cnt contains the number of elements that are in the intersection (I just put that there in case you only needed to get the count of intersected elements for some reason).

dcp 2010-03-08 19:44:23

It might be clearer to use insert iterators instead of allocating a fixed size array for your destination vector.

Kristo 2010-03-08 19:53:55

Answer 4

+1 A:

Use set_intersection. Here's a working sample:

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

int main()
{
    vector<string> v1;
    v1.push_back("Mary");
    v1.push_back("had");
    v1.push_back("a");

    vector<string> v2;
    v2.push_back("a");
    v2.push_back("little");
    v2.push_back("lamb");

    sort(v1.begin(), v1.end());
    sort(v2.begin(), v2.end());

    vector<string> v3;
    set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(), back_inserter(v3));

    copy(v3.begin(), v3.end(), ostream_iterator<string>(cout, "\r\n"));
    return 0;
}

John Dibling 2010-03-08 19:44:29

Answer 5

+3 A:

If I had to do this on two unsorted vectors (without library help), I think I'd add all the elements of one to a hashtable then iterate through the second looking up each--should be more efficient than sorting both lists first as well.

Bill K 2010-03-08 19:44:41

Answer 6

+1 A:

andand 2010-03-08 22:10:48

Answer 7

A:

Since it turns out from your later question you only actually care about 26 characters:

std::bitset<26> in;
for (std::vector<char>::iterator it = first.begin(); it != first.end(); ++it) {
    in[*it - 'a'] = true;
}
for (std::vector<char>::iterator it = second.begin(); it != second.end(); ++it) {
    if (in[*it - 'a']) {
        result.push_back(*it);
        // this line is only needed if 'second' can contain duplicates
        in[*it - 'a'] = false;
    }
}

In fact a bitset<UCHAR_MAX> is small on almost all architectures. Just watch out for those DSPs with 32-bit chars, and be cautious adapting this technique to wchar_t.

With BOOST_FOREACH, the code even looks reasonable:

assert(UCHAR_MAX <= 512 && "What kind of crazy machine is this?");
std::bitset<UCHAR_MAX> in;

BOOST_FOREACH(unsigned char c, first) {
    in[c] = true;
}

BOOST_FOREACH(unsigned char c, second) {
    if (in[c]) {
        result.push_back(c);
        // this line is only needed if 'second' can contain duplicates
        in[c] = false;
    }
}

Steve Jessop 2010-03-10 01:10:35

ansaurus

tags:

views:

answers:

How do I get characters common to two vectors in C++?

related questions