I'm building an application and need a data structure of interconnected objects that can be both queried and traversed. The connections between objects can be arbitrary and are not necessarily known beforehand. I need this data structure to be queryable (what SQL usually provides) and also traversable (what newer graph databases such as Neo4j provide). I'm looking for something that does both and works efficiently with very large data sets. Let's call this data structure dao. I would need the following primitive methods:

// dealing with the objects
dao.save(s);
Something s = dao.load(Something.class, 5);
dao.update(s);
dao.delete(s);

// dealing with the relations
dao.relate(s, t);
dao.unrelate(s, t);

// the tricky methods
dao.querier(Something.class).filter(...).sort(...).values();
dao.traverser(Something.class).start(s).path(...).filter(...).sort(...).values();

The filter would be something like the SQL WHERE clause, the sort would be something like the SQL ORDER BY clause, the start would be the starting node for the traversal, and the path would define things like BFS versus DFS traversal as well as when to stop searching.
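To make the intended semantics concrete, here is a minimal in-memory sketch of the relation and traversal methods. It is only an illustration: the class name InMemoryDao, the undirected relations, and the maxDepth parameter (standing in for the "when to stop" part of path) are all assumptions, and it ignores persistence and large data sets entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for the dao described above. */
public class InMemoryDao<T> {
    // Adjacency list: each saved object maps to the set of objects it is related to.
    private final Map<T, Set<T>> adjacency = new LinkedHashMap<>();

    public void save(T node) {
        adjacency.putIfAbsent(node, new LinkedHashSet<>());
    }

    public void delete(T node) {
        adjacency.remove(node);
        for (Set<T> neighbours : adjacency.values()) {
            neighbours.remove(node);
        }
    }

    /** Assumption: relations are undirected, so both directions are stored. */
    public void relate(T s, T t) {
        save(s);
        save(t);
        adjacency.get(s).add(t);
        adjacency.get(t).add(s);
    }

    public void unrelate(T s, T t) {
        if (adjacency.containsKey(s)) adjacency.get(s).remove(t);
        if (adjacency.containsKey(t)) adjacency.get(t).remove(s);
    }

    /** Query: filter every saved object (like WHERE), then sort (like ORDER BY). */
    public List<T> query(Predicate<T> filter, Comparator<T> sort) {
        List<T> result = new ArrayList<>();
        for (T node : adjacency.keySet()) {
            if (filter.test(node)) result.add(node);
        }
        result.sort(sort);
        return result;
    }

    /** Breadth-first traversal from start, visiting nodes at most maxDepth hops away. */
    public List<T> traverse(T start, int maxDepth, Predicate<T> filter, Comparator<T> sort) {
        List<T> result = new ArrayList<>();
        if (!adjacency.containsKey(start)) return result;
        Set<T> visited = new HashSet<>();
        Deque<T> frontier = new ArrayDeque<>();
        visited.add(start);
        frontier.add(start);
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            Deque<T> next = new ArrayDeque<>();
            for (T node : frontier) {
                if (filter.test(node)) result.add(node);
                for (T neighbour : adjacency.get(node)) {
                    // Set.add returns false for already-visited nodes, which avoids cycles.
                    if (visited.add(neighbour)) next.add(neighbour);
                }
            }
            frontier = next;
        }
        result.sort(sort);
        return result;
    }

    public static void main(String[] args) {
        InMemoryDao<String> dao = new InMemoryDao<>();
        dao.relate("a", "b");
        dao.relate("b", "c");
        dao.relate("c", "d");
        // Everything within two hops of "a", excluding "a" itself.
        System.out.println(dao.traverse("a", 2, s -> !s.equals("a"), Comparator.naturalOrder()));
    }
}
```

The traversal here is strictly BFS; a DFS variant would swap the level-by-level frontier for a stack. Objects used as nodes need consistent equals and hashCode implementations for the map and set lookups to work.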

I've tried to model this as vertices with an adjacency list but there must be a better way. Any ideas?

A: 

Maybe use the SPARQL query engine for Neo4j?

jspcal
A: 

Yes, Neo4j would be a good option. Besides using the raw Java API directly, Jo4neo provides annotation-based persistence of your object model onto the graph. For querying, you can either use Neo4j's high-speed Java traversers or something like the JRuby wrapper, which provides very convenient query abstractions from JRuby. Gremlin also specializes in deep graph traversals, though it is not yet optimized for speed.