views:

512

answers:

3

I'm writing an application that manipulates some sort of social network data, so the ideal underlying data structure is weighted directed graph. I'd like to do the manipulation (and searching) directly on the data, without first loading the entire graph into memory and serializing after.

This could be simulated using a standard SQL database, or key/value store, but that would be very inefficient (for the graph-traversal algorithms I'd like to use, e.g shortest path, etc.).

I'm half a mind to write my own since googling didn't turn up any useful results, but I'd much rather use an existing solution (if there is any and I missed it) than reinvent the wheel. The project is for fun / personal research, so the software would have to be open source (and prefferably capable of running under Linux).

So, are there any projects that would fit the above description?

Thanks!

A: 

You can also look at a graph as an array of nodes. Where each nodes stores a list of its siblings.

So you could simply store 1 file per node in your graph. Then the contents of that file is the list of nodes it is connected to (pointed to since directed).

Then you can read in a node as you need it.

This allows you to do things like iterate through the whole tree, while only keeping a single node in memory at a time.

Brian R. Bondy
Yeah, that's a possibility (amounting to implementing my own database in the way you described). TBH I'm worried about performance when reading tons of small files from the disk (should do some measurements first).DirectedEdge (http://is.gd/l5M1) does something similar (vectors in separate files).
I think if you make it like I mentioned, then you are open to a lot of optimizations like pre-fetch of connected nodes.
Brian R. Bondy
+1  A: 

What about an ODBMS? db40 has Java and .NET implementations, so both run on Linux.

Si
I've a feeling DBMS would also bring a lot of overhead in graph traversing (think calculating pagerank on a million objects), so I haven't really considered them. I didn't know db4o was available for Linux, thanks for the info, maybe it is a realistic option.
+4  A: 

If you're using Java you can try http://neo4j.org/

Jeramy Rutley
There's bindings to other languages as well (for now: Ruby, Python, Clojure), see http://neo4j.org/community/languages/
nawroth