I am currently developing a proof of concept for an alternative data store. The reason why is I need to enhance a read-mostly clustered webapp, but also because I want to free myself from the pain of the sometimes overly-complex ORM+RDBMS solution.
Overall the idea is quite similar to a distributed cache with persistence (letting the cluster be the SoR), however:
- want to be able to retrieve any object along with its children, by id (providing class & id) [only that to start off, as the main querying part is already resolved with lucene in my app].
- need to have map of maps of types ( ~ tables in the relational world), and therein distributed maps of 'dehydrated' stored objects (flattening the object graph via reflection deep cloning)
- a bin log (like Prevayler, for example) for
- eventual recovery if whole cluster goes down
- development (and ability to refactor code / change structure)
- perhaps asynchronously processed for other purposes (reporting, whatever)
- eventually later on try to integrate a statically-typed query mechanism, like LINQ, Jaque or H2's JaQu / see ODBs / Lucene (?)
- it has to be transaction-aware (not sure "JTA type" though)
I'm planning to implement this idea with Hazelcast (I love its super-simple API) or Terracotta (which I never used - but I'm aware of their 'sweet spot', medium-term data). If you will, my aim is to do more or less what Jonas once blogged about. Using one of these, stored data would roughly have to fit in the sum of the JVM heaps of the cluster.
This should be pretty simple to scale, would avoid the relational impedance mismatch (ie save as with an ODB) and JDBC + I/O overhead.
Do you know of other tools/frameworks or combination thereof already providing similar functionality, that I'm ignoring? Can you suggest other ways of tackling this 'getting rid of the DB'? What flaws do you already see in this idea? Concurrency-wise would it make sense to consider Scala instead of Java?
How about non-relational data stores such as Couch DB, Neo4j, HyperTable, HBase?
A similar question was asked one month ago - but there was no concrete solution.
BTW I just stumbled upon the concept of Enterprise Data Fabric, which, to my surprise, describes a lot of these ideas.