views:

39

answers:

4

Hi

Does anyone know a more efficient way to keep a graph information (i.e. more efficient than keeping it as 2-D array), in terms of memory space or build time?

you can assume it's values are limited between 0-255.

Thanx!

+1  A: 

If the graph is fairly sparse then you'll save space by storing it as an edge list (from node, to node) rather than an adjacency matrix. If all edges are bidirectional then you only need store any edge once, of course.

Will A
+1  A: 

Here are a few standard ways to represent a (directed) graph:

For a graph of 4 nodes:

Adjacency matrix:

  0  1  2  3 
0 F  F  T  F
1 T  F  T  T
2 T  F  F  F
3 F  F  F  F

Adjacency List:

((0, 2), (1, 0), (1, 2), (1, 3), (2, 0))

[can't remember the name for this one]:

(
 (0, (2,)     ), 
 (1, (0, 2, 3)), 
 (2, (0,)     ),
 (4, (,)      ),
)

The Adjacency matrix is simple and the fastest representation, but takes the most memory (N*N, where N the number of rows), except if you have an extremely dense graph. You may be able to save some memory by using bit-arrays, if you only have a simple unweighted graph.

Adjacency List is simple, but slower than Adjacency Matrix, and is memory efficient if you have a sparse graph (2*M, where M is the number of edges).

The last one is slightly more complicated (as you need variable-size array), but more memory efficient than Adjacency List if you have a large number of edges (2*N+M, where N is the number of vertices, M the number of edges)

The Adjacency Matrix takes (N*N*b) space. Adjacency List takes ((2+b)*N) memory. And the last representation takes (2*N+M*(1+b)) memory.

If you know you always have less than 256 vertices (8-bit), and the weights are less than 256 (8-bit), then the Adjacency Matrix takes (N*N*8) space. Adjacency List takes (24*N) memory. And the last representation takes (16*N+M*16) memory.

Lie Ryan
A: 

If you don't need to modify your graph after creation, take a look at the compressed sparse row (CSR) format. A description from BGL:

The CSR format stores vertices and edges in separate arrays, with the indices into these arrays corresponding to the identifier for the vertex or edge, respectively. The edge array is sorted by the source of each edge, but contains only the targets for the edges. The vertex array stores offsets into the edge array, providing the offset of the first edge outgoing from each vertex. Iteration over the out-edges for the ith vertex in the graph is achieved by visiting edge_array[vertex_array[i]], edge_array[vertex_array[i]+1], ..., edge_array[vertex_array[i+1]]. This format minimizes memory use to O(n + m), where n and m are the number of vertices and edges, respectively. The constants multiplied by n and m are based on the size of the integers needed to represent indices into the edge and vertex arrays, respectively(...)

spenthil
A: 

Theres a list of graph representations I found useful here:

Graph Data Structures

Armand A