I want to store a graph of millions of nodes where each node links to others in an undirected manner (if A points to B, B automatically points to A). I have examined Neo4j and OrientDB as possible solutions, but they seem to be oriented toward directed graphs, and since Neo4j is not free for more than 1 million nodes, it is not a solution for me.

Can you help me decide which of the other NoSQL DBs (Redis, CouchDB, MongoDB, ...) would suit this best, and how it could be implemented? I want to run property-free breadth-first queries (just give me the linked elements) limited to 2 levels of depth: given A<->B, B<->C, C<->D, querying A should give me B and C, but not D.

+1  A: 

Neo4j always stores relationships/edges as directed, but when traversing/querying you can easily treat the graph as undirected by using Direction.BOTH or in some cases by not defining a direction at all. (This way there's no need for "double" edges to cover both directions, you simply ignore the direction - and there's no performance penalty when traversing edges "backwards".)
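To illustrate the idea in a database-independent way, here is a minimal in-memory Python sketch. It is not Neo4j's API: `DirectedStore`, `neighbors_both` and `within_depth` are hypothetical names made up for this example. Each edge is stored once, with a direction, and the traversal simply follows edges both ways, which is what querying with both directions amounts to conceptually. The depth limit of 2 matches the query in the question.

```python
from collections import defaultdict, deque


class DirectedStore:
    """Hypothetical toy store: every edge is kept once, with a direction."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # node -> targets of outgoing edges
        self.in_edges = defaultdict(set)   # node -> sources of incoming edges

    def add_edge(self, src, dst):
        # One stored edge; no "double" edge is needed for the reverse direction.
        self.out_edges[src].add(dst)
        self.in_edges[dst].add(src)

    def neighbors_both(self, node):
        # Ignore direction: follow outgoing and incoming edges alike.
        return self.out_edges[node] | self.in_edges[node]


def within_depth(store, start, max_depth=2):
    """Breadth-first search returning all nodes within max_depth hops of start."""
    seen = {start}
    found = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in store.neighbors_both(node):
            if nxt not in seen:
                seen.add(nxt)
                found.add(nxt)
                frontier.append((nxt, depth + 1))
    return found


# The chain from the question, with arbitrary stored directions.
store = DirectedStore()
for src, dst in [("A", "B"), ("C", "B"), ("C", "D")]:
    store.add_edge(src, dst)

print(sorted(within_depth(store, "A")))  # ['B', 'C']
```

Note that the edge between B and C is stored as C -> B, yet the search starting at A still reaches C, because direction is ignored during traversal. D is three hops away and is therefore excluded.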

The 1 million "primitives" limit was removed quite a while ago. If your code is open source, you can use the community version for a DB of any size. For other cases there are the commercial versions, which include one free alternative.

nawroth
+1  A: 

OrientDB has no limit on the number of nodes. Furthermore, the default model is bidirectional. You can use it for free, including for commercial purposes, since it is released under the Apache 2 license.

The GraphDB is documented here: http://code.google.com/p/orient/wiki/GraphDatabase. Basically, you can use either the native API or the Blueprints implementation. The native API offers an extension of the SQL language with special operators for graphs. Example:

SELECT FROM Account WHERE friends TRAVERSE (1,7) (address.city.country.name = 'New Zealand')

That means: give me all the accounts that have a friend who lives in New Zealand. Friends are traversed up to the 7th level of depth.

The second option, the Blueprints implementation, lets you use the full Blueprints stack, such as the Gremlin language, to write your most complex queries.

Lvca