views:

1002

answers:

8

I'm trying to decide between going with a pre-made graph/node network library or to roll my own.

I'm implementing some graph search algorithms which might require some significant customization to the class structure of the node and/or edges.

The reason I'm not sure what to do is that I'm unsure if customization of a pre-made might be more expensive/trouble than making my own in the first place. I'm also curious, but less so, of the performance trade-offs.

Is there anyone out there have direct experience with using one of the libraries out there and have advice based on a success or failure story? I want to hear the worst so that whatever I chose, I know what I'm getting into.

There are only two that I've found in my search so far: The Boost Graph Library (BGL) and GOBLIN. Specific advice on either of these, or suggestions for others is greatly appreciated as well. BGL seems pretty damn arcane. Is it worth struggling through?

+6  A: 

I think if you can leverage the graph traversal algorithms, it is worth using the Boost Graph. Creating and using custom vertices is pretty easy. The examples included with BGL were useful for learning how to use it.

I agree with Clifford, I've used GraphViz to help visualize the graph during development and found it very useful.

i_like_caffeine
I'm curious, what do you use BGL for? Have you run into any issues? I prefer to hear the worst even if I'm going to choose something so I know what I'm getting into.
Catskul
+12  A: 

I can perhaps provide a little guidance on the BGL.

The library is very flexible. The cost of this is that the syntax can be very baroque, in order to accommodate all the possibilities. However, it is sufficiently flexible that simple things can be done simply.

Unfortunately the boost documentation goes at things full tilt, providing a description only of the full complexity, without a hint of how simple things can be.

( "Any sufficiently advanced technology is indistinguishable from magic" - Arthur C. Clarke. What he should have said is "Any advanced technology, sufficiently badly documented, is indistinguishable from magic )

Consider:

typedef property_map<Graph, vertex_index_t>::type IndexMap;
IndexMap index = get(vertex_index, g);
typedef graph_traits<Graph>::vertex_iterator vertex_iter;
std::pair<vertex_iter, vertex_iter> vp;
for (vp = vertices(g); vp.first != vp.second; ++vp.first) {
   std::cout << index[*vp.first] <<  " ";
}

This is how the "quick tour" suggests we print out a list of the graph vertices. However, a little study shows that a vertex is no more than an integer index, and the code can be greatly simplified to

for (int v = *vertices(g).first; v != *vertices(g).second; ++v)
    std::cout << v << " ";

Perhaps there are some magical things that cannot be achieved with this simplified code, but for every day use it reasonable to drastically prune the syntax that encrust BGL so you can see what your are coding.

Sometimes the elaborate syntax cannot be removed. ( Or perhaps I have just not noticed the underlying truth ). Then I usually use a little utility function to encapsulate the complexity abd keep it away from the algorithm I am working on.

For example, I often need to loop over the children of a vertex

vector<int> getVertexChildren( int v )
{
    vector<int> vc;
    typedef std::pair<graph_traits<graph_t>::out_edge_iterator, graph_traits<graph_t>::out_edge_iterator> out_edge_iter_pair_t;
    for( out_edge_iter_pair_t ep = out_edges(v,m_tree);
     ep.first != ep.second; ++(ep.first))
    {
     vc.push_back( target( *ep.first, m_tree ) );
    }
    return vc;
}
#define FOR_ALL_CHILDREN( v ) vector<int> vc=getChildren(v); BOOST_FOR_EACH( int child, vc )

The bottom line is: go ahead and use BGL. It can be simplified to do simple things, but once you have learned to use it, all the immense flexibility will be available whenever you do need it.

ravenspoint
What is your experience in debugging it? I have the fear that debugging with BTL might be a nightmare.
Catskul
Also, have you tried writing new functionality for it? I'm looking at dijkstra_shortest_paths.hpp as an example, and am astounded at how arcane it is : /
Catskul
Debugging is more difficult than home brewed code. My debugger (MSVS2008 ) cannot penetrate into the deep data structures used by BGL. However, you get a lot less bugs than with home brew.I have not added new functionality to BGL.
ravenspoint
(+1) for the comment on the comment by Arthur C. Clarke.
TheMachineCharmer
+4  A: 

“require some significant customization to the class structure of the node and/or edges.”

Have you looked at the “bundled properties” feature of the BGL?

http://www.boost.org/doc/libs/1%5F40%5F0/libs/graph/doc/bundles.html

This is both powerful and not the least bit acrcane.

You define a classes for your edges and your vertices

class cMyVertex
{
…
};
class cMyEdge
{
…
};

Now you define a graph type which will use your classes

typedef boost::adjacency_list<
    boost::listS, boost::vecS, boost::bidirectionalS,
    cMyVertex, cMyEdge > graph_t;

Now you can create graphs of this type that will use your classes, whatever they are, for the vertices and edges.

You can access the attributes and methods of your classes in a perfectly natural way

Graph_t g(10)    // create a graph with 10 vertices
g[1].setA1( “alpha” );  // invoke an accessor on node #1
ravenspoint
+1 for hinting towards bundled properties - they really do ease things a lot for many common use cases.
Steffen Opel
+3  A: 

I rolled my own. You shouldn't follow that example, unless you're absolutely certain you need to.

Still, it's a great learning experience, if you're graph theory is rusty.

I had lots of "fun" with combining digraphs using EBNF operators, and doing epsilon elimination and Hopcroft-based minimisation. Especially with the minimisation, if you can call desperately trying to find good information and figure out how it works "fun" :-(

If you do roll your own, I recommend separating high level operations from the low level digraph data structures - different classes, not just different methods. I didn't - and pretty soon I'll be refactoring heavily because of that poor decision.

Some graphs annotate nodes, some annotate edges, some both. Sometimes two annotations are useful, even if they're just foreign keys for some external data. For example in finite state machines, an edge may be annotated with an input and an output. You're likely to be interested in determinism WRT the input but not the output, so having them both hidden behind a single ID is a pain - and that's besides the whole issue of secondary containers for the what-does-this-ID-refer-to lookups.

Also, sometimes you just want to work with things that weren't designed explicitly as a digraph IYSWIM - e.g. a bunch of data structure nodes that link to each other.

IOW, it's useful to be able to plug in different digraph representations, yet still use the same high-level tools for many operations.

IMO I got it right when I wrote a 'tree tool' class which does things like iterating and subscripting in treeview order and XML element tag order. The underlying tree representation is pluggable (policy based template). With digraphs, I messed up.

Anyway, I don't know what Boost or any other graph library actually provides, but if it provides what you need I say use it. It'll save a lot of headaches.

Steve314
+2  A: 

You may also take a try with LEMON graph library.

I could use it to perform Dijsktra shortest path search after reading the graph from a configuration file.

A new version has just come out.

rics
I second the suggestion of LEMON. I've been testing it out today and am impressed with the clean API design and good options for different graph implementations; dynamic graphs vs static, etc. I've scaled it up to 1 million nodes and 100 million edges so far with no performance issues, etc. Lastly, it compiled and build-tested cleanly on the three platforms I use (Solaris, Linux, and MacOS); always nice to see when using open source. LEMON is no lemon. ;-)
Joel Hoff
+3  A: 

This really isn't a concise answer, its more of adding to what has already been said.

The solution to this problem depends on the context of the problem. If you wish to get a great understanding how graph algorithms, and software works. Write your own. If you want to get an advanced knowledge of graph algorithms and structures or to implement them into your own program then learn a standard graph library. (See the reasons for using reusable code)

The best of both worlds. Do both. If you are stretched for time, get a book on this or follow tutorials and the examples.

Another suggestion: Ask a new pointed question about what is the {best, most reliable, easiest to learn} graph library for {describe a very general problem}? This will help you choose what to learn, rather than trying at random to find the best one that suits your needs. Someone has already seen this problem, ask for advice.

Using Reusable Code:

  1. Code that someone else has written including test cases will generally be more reliable than your own.
  2. Fixes are easer to implement (with updates to the component you are reusing), this is true in Boost (since they do frequent updates, and that Boost is peer reviewed).
  3. Once you learn one framework you can apply that to other applications... who knows you might get a job later in life using the framework. Companies like reusing rather than reinventing the wheel.
monksy
+7  A: 

I recently gave the boost graph library a trial for Dijkstras shortest path problem. Results:

  • Very High performance

  • Very little code needed

  • Very flexible and extensible

  • Hard to understand or debug

Advice: Use it but before you do so read The Boost Graph Library: User Guide and Reference Manual

RED SOFT ADAIR
Not too surprising - it's basically the general story with the C++ standard library which Boost seeks to extend.
Steve314
I would certainly *not* agree to say that the STL is hard to understand or debug. The STL itself is straight forward, very well defined and its even reasonable to debug. boost::graph is not. boost::spirit (parser library) even worse.
RED SOFT ADAIR
+2  A: 

Before deciding to build your own graph library, I would have a good look at igraph (http://igraph.sourceforge.net/). It is written in C and is extremely fast and offers a wider range of basic and advanced graph algorithms (including visualization). In addition, it has a python wrapper for fast coding so I think this solution combines speed of coding and speed of execution.

DrDee