tags:

views:

109

answers:

3

Hi,

I am writing an OCaml function where I need to merge two maps. I have not been able to figure out the semantics of the merge function provided by functor Map.Make (found in version 3.12.0 of OCaml). Could someone provide me with a more detailed explanation than the OCaml manual? An example would probably enough for me to figure it out.

Additionally, the two maps that I need to merge have some interesting properties: the keys have the same type (int, actually), and their domain is disjoint. Would there be a more efficient approach than the merge routine?

Thanks in advance.

Best regards,

-- David.

+4  A: 

merge takes a function and two maps. For each key present in either map, the function will be called. If the key is only present in one of the maps, the value for the other one will be passed in as None (which is why the arguments are options). If the functions returns Some x, the new map will have the value x for the key in question. Otherwise the key won't be present.

Example:

let map1 = add 1 2 (add 2 3 empty);;
let map2 = add 2 5 (add 3 4 empty);;
let map3 = merge (fun k xo yo -> match xo,yo with
    | Some x, Some y -> Some (x+y)
    | _ -> None
  ) map1 map2;;

map3 now contains the mapping 2 -> 8.

If you change it to:

let map3 = merge (fun k xo yo -> match xo,yo with
    | Some x, Some y -> Some (x+y)
    | None, yo -> yo
    | xo, None -> xo
  ) map1 map2;;

It will contain the mappings 1 -> 2, 2 -> 8 and 3 -> 4.

sepp2k
Crystal clear, thanks.
David
A: 

Based on the first answer, and considering the additional question (merge maps in the case of disjoint domains), I would then propose the following generic merge routine:

let merge_disjoint m1 m2 = 
  IntMap.merge 
    (fun k x0 y0 -> 
       match x0, y0 with 
         None, None -> None
       | None, Some v | Some v, None -> Some v
       | _, _ -> invalid_arg "merge_disjoint: maps are not disjoint")
    m1 m2

Would there be some more efficient way to do this?

-- David.

David
+2  A: 

Since your maps are disjoint then you can just loop through the smaller map and insert into the larger:

let disjoint_merge m1 m2 =
  if (IntMap.cardinal m1) < (IntMap.cardinal m2) then
    IntMap.fold IntMap.add m1 m2
  else
    IntMap.fold IntMap.add m2 m1

In the event that you are using an older version of OCaml that doesn't include the cardinal function you can just choose one map to always iterate through. As for what the above code does, is it uses:

val fold : (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b

Which basically iterates through all the elements a map, and calls a function that takes a key, value and some other variable of type 'b and returns something of that type ('b). In our case we are passing the function IntMap.add and that other variable is our second map. So it iterates through all of the elements and adds them to the other map.

EDIT: You are better off just doing:

let disjoint_merge m1 m2 =
  IntMap.fold IntMap.add m1 m2

EDIT2: even better:

let disjoint_merge = IntMap.fold IntMap.add;;

I just looked at the implementation of cardinal and it walks the tree and returns the count. So by checking sizes you are doing O(n + m + min(n,m)*log(max(n,m))) instead of just O(n*log(m)).

Niki Yoshiuchi