views:

64

answers:

5

Hi

I am trying to find the most efficient, optimized and fastest way to compare to std vectors of CString. the strings in question are case-sensitive. I have tried using the == operator for the vector container but this sometimes return false positives. I mean for instance if one vector contains elements in the order (a,b,c) and the other has them in the order (b,c,a) the == operator will return false even thought they share the same data. Another thing is that it does not do case sensitive comparison.

I have thought of using a basic nested loops approach like this:

//Not Tested

BOOL bMatch = TRUE;
for(int i=0; i<Vec1.size();i++)
{
  if(!bMatch)
     break;
  int nComp=0;
  for(int j=0;j<Vec2.size();j++)
  {
     if(vec1[i].CompareNoCase(Vec2[j])==0)
        {
          //We have a match--check next item
          break;
        }
     else
        {
          nComp++;
          if(nComp == Vec2.size()-1)
             {
                 //Reached end of vector and no match found
                 //Vectors don't match
                 bMatch=FALSE;
             }
        }

  }
}

The above code is not tested and I am not sure if there is probably a better way to achieve such comparison without the need of using nested loops.

Would appreciate any advice or help...

+2  A: 

if (a,b,c) and (b,c,a) are the same for you then vector is a bad choice, use std::set or std::multiset instead, and, as already said, compare them with std::equal and pass strcmp as the comparator argument. This answer is valid if by CString you mean C-style null-terminated char arrays. If CString means MFC CString, FredOverflow's answer is perfect.

Armen Tsirunyan
+5  A: 

if one vector contains elements in the order (a,b,c) and the other has them in the order (b,c,a) the == operator will return false even thought they share the same data.

Simply insert the data into two containers where the order does not matter and compare those:

std::vector<CString> vec1;
std::vector<CString> vec2;

// ...

std::multiset<CString> set1(vec1.begin(), vec1.end());
std::multiset<CString> set2(vec2.begin(), vec2.end());

bool equal_data = (set1 == set2);

If you want to ignore the case (which the code in your question seems to suggest), you can parameterize std::multiset and std::equal with an appropriate comparator:

struct compareNoCase
{
    bool operator()(const CString& a, const CString& b)
    {
        return a.CompareNoCase(b);
    }
};

std::vector<CString> vec1;
std::vector<CString> vec2;

// ...

std::multiset<CString> set1(vec1.begin(), vec1.end(), compareNoCase());
std::multiset<CString> set2(vec2.begin(), vec2.end(), compareNoCase());

bool equal_data = std::equal(set1.begin(), set1.end(),
                             set2.begin(),
                             compareNoCase());

The parameterization of std::multiset guarantees that "hello" and "HELLO" in the same vector are treated as one value, and the parameterization of std::equal guarantees this across the two vectors.

And finally, if you know that no element occurs twice in the same vector, you can use set instead of multiset. Note that it's probably better to work with a set or multiset right from the start.

FredOverflow
I believe the OP said compare vectors of C-Strings. In which case you can't compare two sets of char* with == and get reasonable results. But the OP may have naturally mixed up terminology
Armen Tsirunyan
@Armen: I'm not certain what a `CString` is supposed to be. I will wait for the OP to clarify and then change my post if necessary.
FredOverflow
@Red: Well, do you want "hello" to be equal to "HELLO" or not?
FredOverflow
yes they should be equal....
Red Serpent
@Red: I have updated my post accordingly.
FredOverflow
A: 

Sort them first with std::sort, then compare them with std::equal.

PigBen
A: 

Use std::sort to sort the two vectors. Then you can use a number of items from the std::algorithm.

  1. there's set_difference which if it yields an empty set tells you that the first of the vectors is wholey contained within the second. See here.
  2. there's set_intersection which yields the intersection of both vectors. In this case if the vector::size() of the newly created list matches that of the old you're golden. See here.
  3. there's set_symmetric_difference which yields a container of all elements not found in one set but found in the other. Hence, if it's empty you're golden. See here.

Here's sample code:

list<CString> symDifList;
set_symmetric_difference( myVect.begin(), myVect.end(), myOtherVect.begin(), myOtherVect.end(), back_inserter( symDifList ), equal_to<CString>() );
return symDifList.empty();

Wherein I've used back_inserter from #include <iterator>, equal_to from '#include and of course,set_symmetric_difference'.

wheaties
A: 

Dont't use simple for loop. Instead you can use iterators to retrieve the elements from both vectors and then compare the values either using _tcscmp or wcscmp.

Raghuram Reddy N