I'm working on an application where both the data size and the SQL query load are going to be heavy. I am deciding between Cassandra and Amazon SimpleDB. Can you please suggest which is more suitable for this kind of scenario?

Cassandra's data indexing seems better than Amazon SimpleDB's, but its queries offer fewer options than SimpleDB's. Amazon SimpleDB also seems to have heavy I/O rates.

A few of the complex use cases involve user activities, with different filters that a user can apply to narrow the list down to the interesting activities.

If you think there is any other cleaner and better solution apart from these two, please suggest it.

+4  A: 

Hi,

SimpleDB can only scale by sharding, has a 10 GB data size limit per domain (its equivalent of a table), and its query time grows with the record count (e.g. it is poor if you have 1 million records). Google's Datastore is slower than SimpleDB. Cassandra is much more scalable, and high-traffic sites have begun to use it; there is nothing better for free if you need high write rates with massive data. See: Cassandra survey.

If your read/write ratio is something like 90% reads and 10% writes, then Terracotta or Infinispan with PostgreSQL is a better fit. There are some free clustering options for PostgreSQL, but none of them is mature (they are mostly prototypes).
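To make the read-heavy case concrete, here is a minimal sketch of the idea of putting a cache in front of the database. It is not the Terracotta or Infinispan API; `ReadThroughCache`, `db_reads`, and the dict standing in for PostgreSQL are all hypothetical names I made up for illustration.

```python
class ReadThroughCache:
    """Serve repeated reads from memory; go to the database only on a miss."""

    def __init__(self, load, store):
        self._load = load      # function that reads one key from the database
        self._store = store    # function that writes one key to the database
        self._cache = {}
        self.db_reads = 0      # counts how often the database was actually hit

    def get(self, key):
        if key not in self._cache:
            self.db_reads += 1
            self._cache[key] = self._load(key)
        return self._cache[key]

    def put(self, key, value):
        # Write to the database first, then drop the stale cached copy.
        self._store(key, value)
        self._cache.pop(key, None)


# A plain dict stands in for the PostgreSQL table (assumption for the sketch).
database = {"user:1": "alice"}
cache = ReadThroughCache(database.get, database.__setitem__)

first = cache.get("user:1")   # miss: reads the database
second = cache.get("user:1")  # hit: served from memory
```

With 90% reads, most calls take the in-memory path, so the database mainly sees the writes and the first read of each key.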

Another option is sharding. Hibernate and NHibernate have sharding support. You can use them with PostgreSQL or MySQL, but you lose joins across shards.
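A rough sketch of what sharding means in practice, and why joins are lost, could look like this. It is not how Hibernate Shards is implemented; the shard names, `shard_for`, and the in-memory lists standing in for separate PostgreSQL or MySQL instances are assumptions for illustration only.

```python
import hashlib

# Hypothetical shard names; each would be a separate database instance.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]

# In-memory stand-ins for the per-shard activity tables.
stores = {name: [] for name in SHARDS}


def shard_for(user_id):
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def record_activity(user_id, activity):
    # All rows for one user land on the same shard.
    stores[shard_for(user_id)].append((user_id, activity))


def activities_for(user_id):
    # Single-user lookup: only one shard is touched.
    return [a for (u, a) in stores[shard_for(user_id)] if u == user_id]


def activities_matching(predicate):
    # Cross-user filter: every shard is queried and the results are
    # merged in application code, because no single database can join them.
    results = []
    for name in SHARDS:
        results.extend(row for row in stores[name] if predicate(row))
    return results
```

Queries keyed by user stay cheap, but any filter that spans users has to fan out to all shards, which is the price you pay for losing joins.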

Regards

sirmak