tags:

views:

300

answers:

6

I'm writing code for a graph.

Which is better for implementing a digraph with weighted edges?

set 1

class Node ;
class Edge
{
  Node *src, *dst ;
  int cost ;
} ;
class Node
{
  list<Edge> edges ;  // slightly redundant.
  // because an edge connects two nodes,
  // the information in each Edge's src
  // is redundant to this
} ;

set 2

// Each node has a map of nodes it connects to,
// with the value being the edge weight
class Node
{
  map<Node*, int> connections ;  // this is map<node, cost>
} ;

Of course, any other suggestions / approaches are appreciated as well!

A: 

What are you trying to do with your graph? Think about how you want to traverse it and how you might want to sort nodes or edges.

Moishe
It is for pathfinding, so I generally need to go from ((node I'm at)) to adjacent nodes.
bobobobo
A: 

You're missing a common one, edge-matrix representation. You use a 2d array and array[i][j] is the weight between i and j (or 0 if they're not connected).

rlbond
Doesn't that result in lots of zeroes (read: wasted space) if there are many nodes that aren't connected to each other (i.e. far from a fully meshed topology)?
Dan Moulding
I thought about this one, but it gets really expensive memorywise when the graph gets huge!
bobobobo
It's efficient if the graph is dense (most nodes are connected to one another directly) and inefficient if the graph is sparse (most nodes not connected to one another directly). It has the advantage of being quick for checking if a given connection exists ("is X connected to Y" is a single array lookup) but the disadvantage of taking longer on average to get a list of all the edges associated with a node (since you have to check a full row/column of the array, instead of just a list of actual edges.
Amber
+2  A: 

Implementation 1 is good for depth first search, Implementation 2 is good for breadth first search, Edge-matrix is good for doing things like computing duals and graph laplacians.

aterrel
Could you please explain why? Since I can efficiently implement both BFS and DFS in such a way that replacing `stack` with `queue` will turn DFS into BFS, the suitability of a representation is always the same for both of the algorithms.
avakar
Avakar, you are correct. I was thinking more the recursive view, but under the hood this is the same.
aterrel
A: 

All representations have their advantages and disadvantages depending upon applications. Here is a small compariosn between adjacency list and matrix

Space Trade off : For a sparse, adjacency matrix an adjacency list representation of the graph occupies less space, because it does not use any space to represent edges that are not present. Using a naive array implementation of adjacency lists on a 32-bit computer, an adjacency list for an undirected graph requires about 8e bytes of storage, where e is the number of edges: each edge gives rise to entries in the two adjacency lists and uses four bytes in each.

Besides the space trade-off, the different data structures also facilitate different operations. It is easy to find all vertices adjacent to a given vertex in an adjacency list representation; you simply read its adjacency list. With an adjacency matrix you must instead scan over an entire row, taking O(n) time. If you, instead, want to know if two given vertices have an edge between them, this can be determined at once with an adjacency matrix, while requiring time proportional to the minimum degree of the vertices with the adjacency list.

atv
+1  A: 

It is difficult to give a good answer, since the two representations both have advantages. The former is better if you want to store edges outside the graph and still be able to retrieve their source. The latter implementation will be more suitable for checking whether two verteces are connected:

bool connected(Node * p, Node * q)
{
    return p->connections.find(q) != p->connections.end();
}

As a side note, in the former version, I see no reason to use list instead of vector.

avakar
`vector`s store their elements in contiguous storage locations, which might not be what you wan't.
Georg Fritzsche
gf, I couldn't find a usage which would be consistent with both representations and in which list would make more sense than vector. The graph is not ordered (since the second version uses map), therefore insert/delete can be done efficiently even in the middle of the vector. With vector you get better cache locality and operator[]. I challenge you to prove me wrong. :)
avakar
A: 

There is a number of differences, which depend on algorithms being used and the graphs being stored:

  1. For sparse graphs and for generic search algorithms representation 1 will be better. Many algorithms have E (the number of edges) in their complexity estimate (like O(V+E) for DFS). This E turns into if you choose representation 2.

  2. The key advantage of representation 2 over rep. 1 is the ability to check whether x is directly connected to y quickly. If you need such functionality, consider using adjacency matrix.

  3. Some algorithms (like Floyd-Warshall) require operating on adjacency matrix. If you run one of them, representation 2 would be reasonable.

Pavel Shved