I am working on what is currently a pet project. Soon it will be going into mainstream production. My biggest barrier is the data storage.

The bulk of the data is document-style, with specific indexes that span several types of data, so a single collection with indexing would work just fine.

I know MongoDB, Caché, and M will handle this brilliantly.

However, the data in the indexes is heavily hierarchical, i.e., a graph. A specific example is the hierarchy of geographical location. To picture this, consider the question "where is a product available?" Is it at the local, city, region, state, national, or international level? I know Neo4J will handle this part with ease.
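To make the hierarchy concrete, here is a minimal sketch in Python. The location names, the `PARENT` map, and both helper functions are hypothetical and only illustrate the idea: a product is scoped to one node and counts as available anywhere beneath that node.

```python
# Hypothetical location hierarchy: each node maps to its parent (None at the root).
PARENT = {
    "world": None,
    "usa": "world",
    "texas": "usa",
    "austin": "texas",
    "downtown-austin": "austin",
    "dallas": "texas",
}


def ancestors(node):
    """Return the node followed by every ancestor up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def is_available(product_scope, customer_location):
    """A product scoped to a node is available anywhere beneath that node."""
    return product_scope in ancestors(customer_location)
```

With this layout, a product scoped to "texas" is available to a customer in "dallas", while one scoped to "austin" is not.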

Data also needs to be queried geospatially. I know SQL Server 2008 can handle this. Neo4J definitely can't.
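For reference, the simplest form of geospatial query is "everything within N km of a point". The sketch below is a brute-force scan using the haversine formula, not a spatial index, so it only illustrates the query shape. The item layout (dicts with `lat` and `lon` keys) and the function names are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(items, lat, lon, radius_km):
    """Linear scan: keep items whose coordinates lie within radius_km of the point."""
    return [
        item for item in items
        if haversine_km(lat, lon, item["lat"], item["lon"]) <= radius_km
    ]
```

A real store would answer this from a spatial index rather than scanning every item.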

From a performance standpoint, I built a prototype in SQL Server, where I had to rig the hierarchy using a parent-child table (it was correctly designed and indexed). For my test on roughly 1.2 million items, it took 3 minutes for the first query to execute. The final execution was relatively fast (1.2ms) once the indexes were hot. I know Neo4J also heavily depends on hot indexes.

I have considered building a RAM index engine using LINQ and using something like MongoDB as a data store and query helper. However, this adds to the project dev time, and is, of course, no small venture.
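As a rough idea of what such a RAM index could look like, here is a minimal sketch in Python rather than LINQ. The class name, its methods, and the sample hierarchy are all hypothetical. Each document id is filed under the location node it is scoped to, and a lookup walks from the queried location up to the root, collecting everything scoped along the way. The document bodies themselves would stay in the backing store.

```python
from collections import defaultdict


class HierarchyIndex:
    """In-memory index from a hierarchy node to the document ids scoped to it."""

    def __init__(self, parent):
        self.parent = parent               # node -> parent node (None at the root)
        self.by_scope = defaultdict(set)   # node -> ids of documents scoped to it

    def add(self, doc_id, scope):
        self.by_scope[scope].add(doc_id)

    def available_at(self, location):
        """Ids of documents scoped to the location or to any of its ancestors."""
        result = set()
        node = location
        while node is not None:
            result |= self.by_scope.get(node, set())
            node = self.parent[node]
        return result


# Hypothetical sample data.
parent = {"world": None, "usa": "world", "texas": "usa", "austin": "texas", "dallas": "texas"}
index = HierarchyIndex(parent)
index.add("doc-national", "usa")
index.add("doc-austin-only", "austin")
index.add("doc-dallas-only", "dallas")
```

Each lookup costs one set union per level of the hierarchy, so with six levels it stays cheap regardless of how many items are indexed.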

Any suggestions?

+1  A: 

It's Java, but you might be interested in HyperGraphDB.

Gates VP