views:

1201

answers:

4

How can I implement a bipartite matching algorithm (probably based on a max-flow algorithm) in C or C++ ?

To be specific, I have this input in a file: (1,3) (1,5) (2,5)

(M,F) --> where M represents id of MALE and F is id of FEMALE.

I need to find the maximum number of matches and show matched couples. Like: matches: 1&3 , 2&5

I have read in some books I can base this problem on a "maximum flow in a network" algorithm, but I couldn't find any specific info other than the sentence "this problem can be solved by .... algorithm". I have little knowledge about max-flow, and dont know how to implement it either...

+4  A: 

Yes, bipartite matching can be reduced to maximum flow:

  1. You're given sets of nodes M and F. Add a directed edge from a node m in M to a node f in F if you've got the pair (m, f) in your file.

  2. Add a single node S with a directed edge from S to every node in M (this is your "super-source" node).

  3. Add a single node T with a directed edge from every node in F to T (this is your "super-sink" node).

  4. Now, you need to find the maximum flow (with all your edges of weight 1) from S to T.

So what the heck is maximum flow? A flow from S to T is a set of edges such that for each node (except S and T), the weight of its in-flux edges is the same as the weight of its out-flux edges. Imagine that your graph is a series of pipes, and you're pouring water in the system at S and letting it out at T. At every node in between, the amount of water going in has to be the same as the amount of water coming out.

Try to convince yourself that a flow corresponds to a matching of your original sets. (Given a flow, how to you get a matching? Given a matching, how to you get a flow?)

Finally, to find the maximum flow in a graph, you can use the Ford-Fulkerson algorithm. The above wikipedia page gives a good description of it, with pseudo-code.

Jesse Beder
+1  A: 

Yes, if you already have code to solve the max-flow problem, you can use it to solve bipartite matching by transforming the graph as shown towards the end of this lecture, but that's probably not the right approach if you are starting from scratch. If you just want to implement some fairly simple code to solve the problem for examples that don't get too huge, you are better off using a simple augmenting path approach as outlined here. That gives you an O(|V||E|) approach that is pretty easy to code and adequate for all but very large graphs. If you want something with better worst-case performance, you could try the Hopcraft-Karp algorithm, which finds multiple augmenting paths at once and has a O(sqrt(|V|)|E|) run time bound, but the Wikipedia article on it notes that:

Several authors have performed experimental comparisons of bipartite matching algorithms. Their results in general tend to show that the Hopcroft–Karp method is not as good in practice as it is in theory: it is outperformed both by simpler breadth-first and depth-first strategies for finding augmenting paths, and by push-relabel techniques.

In any case, you should definitely understand and be able to implement a simple augmenting-path approach before trying to tackle either Hopcraft-Karp or one of the push-relable techniques mentioned in the references of the Wikipedia article.

Edit: For some reason, the links above aren't showing up correctly. Here are the URLs in question: (http://oucsace.cs.ohiou.edu/~razvan/courses/cs404/lecture21.pdf), (http://www.maths.lse.ac.uk/Courses/MA314/matching.pdf), and (http://en.wikipedia.org/wiki/Hopcroft–Karp_algorithm).

dewtell
This gives me an error: http://www.maths.lse.ac.uk/Courses/MA314/matching.pdf is this link correct?
Karussell
A: 

Here is an experimental study of flow algorithms for maximum bipartite matching:

@article{cherkassky98, author = {Boris V. Cherkassky and Andrew V. Goldberg and Paul Martin and Joao C. Setubal and Jorge Stolfi}, title = {Augment or Push: A Computational Study of Bipartite Matching and Unit Capacity Flow Algorithms}, journal = {Journal of Experimental Algorithmics}, volume = 3, number = 8, year = 1998 }

The winner was a push-relabel algorithm, which I believe was the implementation from Andrew Goldberg's "BIM" package, which you can download here:

http://www.avglab.com/andrew/soft.html

Mind you, if it's important that you code up the solution yourself, you might want to settle for Ford-Fulkerson, as Jesse suggested. If you do that, I recommend you use breadth-first search, not depth-first search, to find the augmenting path (for reasons explained in the article above).

A: 

The QuickGraph library includes a bipartite matching algorithm, which I just worked on and checked in a fix for. It wraps the Edmonds Karp maximum flow algorithm.

The only documentation for the algorithm so far is the unit tests I added. If anyone would like to add a (hopefully faster) implementation which does not simply wrap a maxflow algorithm, please contact me.

RyanHennig