views:

968

answers:

3

I have a List of Map[String, Double], and I'd like to merge their contents into a single Map[String, Double]. How should I do this in an idiomatic way? I imagine that I should be able to do this with a fold. Something like:

val newMap = Map[String, Double]() /: listOfMaps { (accumulator, m) => ... }

Furthermore, I'd like to handle key collisions in a generic way. That is, if I add a key to the map that already exists, I should be able to specify a function that returns a Double (in this case) and takes the existing value for that key, plus the value I'm trying to add. If the key does not yet exist in the map, then just add it and its value unaltered.

In my specific case I'd like to build a single Map[String, Double] such that if the map already contains a key, then the Double will be added to the existing map value.

I'm working with mutable maps in my specific code, but I'm interested in more generic solutions, if possible.

+6  A: 

Well, you could do:

mapList reduce (_ ++ _)

except for the special requirement for collision.

Since you do have that special requirement, perhaps the best would be doing something like this (2.8):

def combine(m1: Map, m2: Map): Map = {
  val k1 = Set(m1.keysIterator.toList: _*)
  val k2 = Set(m2.keysIterator.toList: _*)
  val intersection = k1 & k2

  val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
  val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_)) 
  r2 ++ r1
}

You can then add this method to the map class through the Pimp My Library pattern, and use it in the original example instead of "++":

class CombiningMap(m1: Map[Symbol, Double]) {
  def combine(m2: Map[Symbol, Double]) = {
    val k1 = Set(m1.keysIterator.toList: _*)
    val k2 = Set(m2.keysIterator.toList: _*)
    val intersection = k1 & k2
    val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
    val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
    r2 ++ r1
  }
}

// Then use this:
implicit def toCombining(m: Map[Symbol, Double]) = new CombiningMap(m)

// And finish with:
mapList reduce (_ combine _)

While this was written in 2.8, so keysIterator becomes keys for 2.7, filterKeys might need to be written in terms of filter and map, & becomes **, and so on, it shouldn't be too different.

Daniel
Kinda defeats the point to ignore that requirement.
Jeff
That's why I expanded upon it.
Daniel
Thanks for that.
Jeff
+1  A: 

Interesting, noodling around with this a bit, I got the following (on 2.7.5):

General Maps:

   def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: Seq[scala.collection.Map[A,B]]): Map[A, B] = {
    listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
      Map(
        s.projection.map { pair =>
        if (m contains pair._1)
          (pair._1, collisionFunc(m(pair._1), pair._2))
        else
          pair
      }.force.toList:_*)
    }
  }

But man, that is hideous with the projection and forcing and toList and whatnot. Separate question: what's a better way to deal with that within the fold?

For mutable Maps, which is what I was dealing with in my code, and with a less general solution, I got this:

def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A, B] = {
    listOfMaps.foldLeft(mutable.Map[A,B]()) {
      (m, s) =>
      for (k <- s.keys) {
        if (m contains k)
          m(k) = collisionFunc(m(k), s(k))
        else
          m(k) = s(k)
      }
      m
    }
  }

That seems a little bit cleaner, but will only work with mutable Maps as it's written. Interestingly, I first tried the above (before I asked the question) using /: instead of foldLeft, but I was getting type errors. I thought /: and foldLeft were basically equivalent, but the compiler kept complaining that I needed explicit types for (m, s). What's up with that?

Jeff
You don't need to use `force` here, because `toList` is strict.
Daniel
As for `foldLeft` vs `/:`, you do realize the object and the first argument are swapped between them? The expression `x foldLeft y` is equivalent to `y /: x`. Beyond that, there's a bunch of syntax issues. Basically, you *have* to write `(y /: x) (folding expression)`, while `foldLeft` can be used as `x.foldLeft(y)(folding expression)`.
Daniel
Yes, I knew about methods ending with : swapping the object with the argument. That's how I wrote the example in the question. I did forget to put y /: x in parens, though, and I bet that was a problem. Thanks!
Jeff
+6  A: 

How about this one:

def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] =
  (Map[A, B]() /: (for (m <- ms; kv <- m) yield kv)) { (a, kv) =>
    a + (if (a.contains(kv._1)) kv._1 -> f(a(kv._1), kv._2) else kv)
  }

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
val mm = mergeMap(ms)((v1, v2) => v1 + v2)

println(mm) // prints Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)

And it works in both 2.7.5 and 2.8.0.

Walter Chang
This is exactly how I was trying to do it initially. I didn't think to place the for-comprehension there -- I'm still getting used to using them like this, but it makes sense. In this case I can see how it's much like Python's list comprehensions, which I'm far more comfortable with. Also like the use of the result-bearing if expression inside the call to a.+().
Jeff