tags:

views:

171

answers:

7

Given that I have two STL maps, say:

map<int, double> A;
map<int, double> B;

I'd like to get the intersection of the two maps, something of the form:

map<int, pair<double,double> > C;

Where the keys are the values in both A and B and the value is a pair of the values from A and B respectively. Is there a clean STL-like way to go about this?

+5  A: 
#include <map>
using namespace std;
typedef int KeyType;
typedef double ValueType;

typedef map< KeyType, ValueType > MyMap;
typedef MyMap::iterator MyMapIter;
typedef MyMap::const_iterator MyMapConstIter;

typedef pair< ValueType, ValueType > ValueTypePair;

typedef map< KeyType, ValueTypePair > MyMapIntersection;

int main() {

    MyMap A;
    MyMap B;
    MyMapIntersection C;

    // fill up A, B

    for( MyMapConstIter cit = A.begin(); cit != A.end(); ++cit ) {

        const KeyType x = cit->first;

        MyMapIter found = B.find( x );
        if( found != B.end() ) {

            ValueTypePair valuePair =
                ValueTypePair( cit->second, found->second );
            C.insert( pair< KeyType, ValueTypePair>( x, valuePair ) );
        }
    }

}
ArunSaha
The algorithm can be improved by avoiding the `find` calls. Maps are ordered, and you can iterate both input maps at the same time. Whenever the left and right iterator values differ, advance the minimum of the two. The current algorithm has `O(N log M)` cost, while the improved solution would be `O( max(N,M) )` with `N` and `M` being the two input map sizes. +1 anyway for actually providing a solution that works :)
David Rodríguez - dribeas
Without looking real hard at it, I'd think there'd be something in <algorithm> that would allow you to get rid of the `for` loop.
T.E.D.
@T.E.D.: I don't think there is. Apparently the code is iterating over a single container, but the fact is that the iteration happens with two different containers at once. As it is currently implemented, it would seem as if the missing `copy_if` or the existing `std::remove_copy_if` could be used to filter with a functor that performed the `find`, but that would not provide the second value to compose. `std::transform` could be tried with a functor that produced the composed value, but that would also fail as the functor is required to always produce a value and cannot filter.
David Rodríguez - dribeas
... the `std::set_difference`, again cannot compose the result value, and I am out of ideas now.
David Rodríguez - dribeas
@Niki Yoshiuchi: wrong. A binary tree where each node has a pointer to its parent can be transversed in `O(N)` where `N` is the number of elements. The simple prove is drawing a tree, and painting each one of the links as it is transversed first down and then back up. Each edge is transversed exactly twice during a tree transversal.
David Rodríguez - dribeas
@David: @Niki ?? Forgive me, but I don't see Niki Yoshiuchi in this page.
ArunSaha
@ArunSaha: There was a comment, deleted now, that said that tree transversal was `O( N log N )`, and a whole discussion saying that the parallel transversal of the two trees was slower than transversing one tree and calling `find` on the second map in each step.
David Rodríguez - dribeas
+7  A: 

I don't think there is a pure STL way of implementing what you want. Manual implementation should not be too complicated.

Note that std::set_intersection is not a solution. The main reason being that it compares the dereferenced iterators and then copies one of the elements to the output iterator.

While comparison of the full dereferenced iterator includes the associated value (which I understand you do not want to consider as part of the key), can be solved by providing a comparison functor that would only test the key (std::pair<const Key, Value>::first), the problem of the algorithm copying only one of the two values and not composing the solution cannot be tackled externally.

EDIT: A simple linear time implementation of the function:

Note, as @Mark Ransom comments, that this solution adds an extra requirement: the keys must be equality comparable. That is not an issue with his solution here, or similarly in the answer by @Matthiew M here. It would be shameful to modify this algorithm with that fix :)

Another great advantage of @Mark's implementation is that it can compose from maps that store different value types as long as the keys are the same (which seems like an obvious requirement). I wish I would upvote more than once there..

template <typename Key, typename Value>
std::map<Key,std::pair<Value,Value> > 
merge_maps( std::map<Key,Value> const & lhs, std::map<Key,Value> const & rhs )
{
    typedef typename std::map<Key,Value>::const_iterator input_iterator;
    std::map<Key, std::pair<Value,Value> > result;
    for ( input_iterator it1 = lhs.begin(), it2 = rhs.begin(),
                         end1 = lhs.end(), end2 = rhs.end();
            it1 != end1 && it2 != end2; )
    {
        if ( it1->first == it2->first )
        {
            result[it1->first] = std::make_pair( it1->second, it2->second );
            ++it1; ++it2;
        }
        else
        {
            if ( it1->first < it2->first )
                ++it1;
            else
                ++it2;
        }
    }
    return result;
}
David Rodríguez - dribeas
+1: Elegant and less verbose solution than mine. And the function is templated.
ArunSaha
Your code is remarkably similar to mine, and you appear to have beaten me by 2 minutes. It also looks to be formatted better. Mine does have one simple advantage, it only relies on `operator<` and not `operator==`. Shouldn't matter for `int` keys, but might with something more complicated.
Mark Ransom
@Mark Ransom: The other great advantage of your solution is that it is more generic than mine, allowing to *merge* / *compose* from maps with unrelated value types.
David Rodríguez - dribeas
@David: StackOverflow is all about coming up with the best answers. It wouldn't bother me in the least if you modified your answer based on concepts from mine - indeed I've been waiting for that so I could give you a +1 in good conscience. I'll do it now anyway just for stroking my ego.
Mark Ransom
+1  A: 

EDIT: Since I was pretty sure there was a better STL-like solution to this, I figured one out. It's different enough that I'm posting it as a separate answer.

There are a few tricks to this. Firstly, you'd like to use set_intersection, but you have two maps. The solution is a custom comparator and the std::transform algorithm. Someone more familiar with the standard library than me can probably optimize this, but it works. Note that boost::bind would allow you to cut down on the silly helper functions that make this work.

#include <algorithm>
#include <map>
#include <set>

bool myLess(const std::map<int,double>::value_type &v1,
            const std::map<int,double>::value_type &v2) {
    return v1.first < v2.first;
}
int getKey(const std::map<int,double>::value_type &v) {
    return v.first;
}

struct functor {
    std::map<int,double> &m1,&m2;
    functor(std::map<int,double> &im1, std::map<int,double> &im2) : m1(im1), m2(im2) {}
    std::pair<int,std::pair<double,double> > operator() (int x) {
        return std::make_pair(x, std::make_pair(m1[x],m2[x]));
    }
};

int main() {
    std::map<int,double> m1, m2;
    m1[0]=0;m1[1]=1; m1[2]=2; m1[3]=3;
            m2[1]=11;m2[2]=12;m2[3]=13;m2[4]=14;

    std::set<int> keys1,keys2,keys;
    //Extract the keys from each map with a transform
    std::transform(m1.begin(),m1.end(),std::inserter(keys1,keys1.begin()),getKey);
    std::transform(m2.begin(),m2.end(),std::inserter(keys2,keys2.begin()),getKey);
    //set_intersection to get the common keys
    std::set_intersection(keys1.begin(),keys1.end(),keys2.begin(),keys2.end(),
            std::inserter(keys,keys.begin()));

    std::map<int, std::pair<double,double> > result;
    functor f(m1,m2);  //stash our maps into the functor for later use
    //transform from the key list to the double-map
    std::transform(keys.begin(),keys.end(),std::inserter(result,result.begin()),f);
    return 0;
}

Like much of C++, the final use of everything is fairly slick (everything in main()), but the setup is more verbose than we would really like.

jkerian
I don't quite like this proposed solution for two different reasons, first I cannot find the code in `main` to be *slick*, I find the whole think much less readable than a plain straight implementation. The solution requires to generate two sets with the keys, and a third set with the intersection. While the asymptotic complexity is linear, the memory cost (and number of dynamic allocations) has been tripled. It could be improved by avoiding the first `transform`s by providing a comparison functor to `std::set_intersection`.
David Rodríguez - dribeas
... with the addition of an iterator adapter instead of the `std::inserter` to the `std::set_difference` you could avoid two of the three intermediate containers. At any rate, it would not be clean as the simple double loop. The focus should be in maintainability, and the code here (this is subjective) is less maintainable than the straight implementation.
David Rodríguez - dribeas
I don't think you can avoid the first transform, but I invite you to try. The problem is that set_intersection's input and output types are tied up in the same templated types.
jkerian
Iterator adaptors cut this code down immensely, but that's not quite (though some may argue) an STL implementation.
jkerian
+5  A: 
template<typename KeyType, typename LeftValue, typename RightValue>
map<KeyType, pair<LeftValue, RightValue> > IntersectMaps(const map<KeyType, LeftValue> & left, const map<KeyType, RightValue> & right)
{
    map<KeyType, pair<LeftValue, RightValue> > result;
    typename map<KeyType, LeftValue>::const_iterator il = left.begin();
    typename map<KeyType, RightValue>::const_iterator ir = right.begin();
    while (il != left.end() && ir != right.end())
    {
        if (il->first < ir->first)
            ++il;
        else if (ir->first < il->first)
            ++ir;
        else
        {
            result.insert(make_pair(il->first, make_pair(il->second, ir->second)));
            ++il;
            ++ir;
        }
    }
    return result;
}

I haven't tested this, or even compiled it... but it should be O(n). Because it's templated it should work with any two maps that share the same key type.

Mark Ransom
+1: For generalizing to `LeftValue` and `RightValue`, although in the current problem definition they were same.
ArunSaha
+1 for not imposing an extra requirement (`operator==`) as I did :)
David Rodríguez - dribeas
`typename` is missing for the `const_iterators`.
Georg Fritzsche
+1  A: 

A (better) solution based on the fact that the maps are sorted. (Shame on me that I overlooked it.) Thanks to David Rodríguez - dribeas for the suggestion.

#include <map>
using namespace std;
typedef int KeyType;
typedef double ValueType;

typedef map< KeyType, ValueType > MyMap;
typedef MyMap::iterator MyMapIter;
typedef MyMap::const_iterator MyMapConstIter;

typedef pair< ValueType, ValueType > ValueTypePair;

typedef map< KeyType, ValueTypePair > MyMapIntersection;


void constructInsert( MyMapIntersection & c, MyMapConstIter const & acit,
    MyMapConstIter const & bcit ) {

    ValueTypePair valuePair = ValueTypePair( acit->second, bcit->second );

    c.insert( pair< KeyType, ValueTypePair>( acit->first, valuePair ) );
}


int main() {

    MyMap A;
    MyMap B;
    MyMapIntersection C;

    // fill up A, B

    MyMapConstIter acit, bcit;
    for( acit = A.begin(), bcit = B.begin();
        (acit != A.end()) && (bcit != B.end()); /* Inside loop */ ) {

        const KeyType aKey = acit->first;
        const KeyType bKey = bcit->first;

        if( aKey < bKey ) {

            ++acit;
        }
        else if( aKey == bKey ) {

            constructInsert( C, acit, bcit );

            ++acit;
            ++bcit;
        }
        else {

            ++bcit;
        }
    }

}
ArunSaha
+1  A: 

Okay, let's get ready to get your hands dirty :)

I'll be using std::mismatch and std::transform

First of all, some types:

typedef std::map<int, double> input_map;
typedef input_map::const_reference const_reference;
typedef input_map::const_iterator const_iterator;
typedef std::pair<const_iterator,const_iterator> const_pair;

typedef std::map<int, std::pair<double,double> > result_map;

Then predicates

bool less(const_reference lhs, const_reference rhs)
{
  return lhs.first < rhs.first;
}

result_map::value_type pack(const_reference lhs, const_reference rhs)
{
  assert(lhs.first == rhs.first);
  return std::make_pair(lhs.first, std::make_pair(lhs.second, rhs.second));
}

Now main:

result_map func(input_map const& m1, input_map const& m2)
{
  if (m1.empty() || m2.empty()) { return result_map(); }

  // mismatch unfortunately only checks one range
  // god do I hate those algorithms sometimes...
  if (*(--m1.end()) < *(--m2.end()) { return func(m2, m1); }

  const_pair current = std::make_pair(m1.begin(), m2.begin()),
             end = std::make_pair(m1.end(), m2.end());

  result_map result;

  // Infamous middle loop, the check is middle-way in the loop
  while(true)
  {
    const_pair next = std::mismatch(p.first, end.first, p.second, less);

    std::transform(current.first, next.first, current.second,
      std::inserter(result, result.begin()), pack);

    // If any of the iterators reached the end, then the loop will stop
    if (next.first == end.first || next.second == end.second) { break; }

    // Advance the lesser "next"
    if (less(*next.first, *next.second)) { ++next.first; }
    else                                 { ++next.second; }

    current = next;
  }

  return result;
}

I find this solution quite elegant... notwithstanding the awkard setup part since we need to ensure that the first range ends up quicker than the second because of mismatch...

Notice that the advance is really stupid, we could loop specifically here until we had *next.first.key == *next.second.key but it would complicate the loop.

I really don't find this better than a handcrafted loop though... consider:

result_map func2(input_map const& lhs, input_map const& rhs)
{
  result_map result;

  for (const_iterator lit = lhs.begin(), lend = lhs.end(),
                      rit = rhs.begin(), rend = rhs.end();
       lit != lend && rit != rend;)
  {
    if (lit->first < rit->first)      { ++lit; }
    else if (rit->first < lit->first) { ++rit; }
    else
    {
      result[lit->first] = std::make_pair(lit->second, rit->second);
      ++lit, ++rit;
    }
  }

  return result;
}

It's much more compact, probably more efficient... sometimes the functions you're looking are not general enough to be in the STL :)

Matthieu M.
+1  A: 

The following is a simplification of my previous answer, mostly taking advantage of the fact that set_intersection CAN be used with maps as input, but only if you make the output a set of std::pairs. The result also cuts down intermediates to a single "common keys" list.

#include <algorithm>
#include <map>
#include <set>

struct cK { //This function object does double duty, the two argument version is for
            //the set_intersection, the one argument version is for the transform
    std::map<int,double> &m1,&m2;
    cK(std::map<int,double> &im1, std::map<int,double> &im2) : m1(im1), m2(im2)
    std::pair<int,std::pair<double,double> > operator() (std::pair<int,double> v
        return std::make_pair(v.first, std::make_pair(m1[v.first],m2[v.first]));
    }
    bool operator() (std::pair<int,double> v1, std::pair<int,double> v2) {
        return v1.first < v2.first;
    }
};

int main() {
    std::map<int,double> m1, m2;
    m1[0]=0;m1[1]=1; m1[2]=2; m1[3]=3;
            m2[1]=11;m2[2]=12;m2[3]=13;m2[4]=14;

    // Get the subset of map1 that has elements in map2
    std::set<std::pair<int,double> > sIntersection;
    cK compareKeys(m1,m2);
    std::set_intersection(m1.begin(),m1.end(),m2.begin(),m2.end(),
            std::inserter(sIntersection,sIntersection.begin()),compareKeys);

    // Use a custom transform to produce an output set
    std::map<int, std::pair<double,double> > result;
    std::transform(sIntersection.begin(),sIntersection.end(),
            std::inserter(result,result.begin()), compareKeys);
    return 0;
}
jkerian