+9  A: 

Cassandra and the other distributed databases available today do not provide the kind of ad-hoc query support you are used to from SQL. Queries with joins can't be distributed across nodes efficiently, so the emphasis is on denormalization instead: you store the data in the shape your queries will read it.
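To make the denormalization point concrete, here is a rough sketch in plain Python. The dicts only stand in for tables and column families, and all the names (`users`, `posts`, `posts_by_user`, `write_post`) are made up for illustration; this is not Cassandra client code.

```python
# Hypothetical data; plain dicts stand in for tables and column families.

# Normalized (relational) layout: two "tables" linked by user_id.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
posts = [
    {"post_id": 10, "user_id": 1, "title": "hello"},
    {"post_id": 11, "user_id": 1, "title": "again"},
    {"post_id": 12, "user_id": 2, "title": "hi"},
]


def posts_with_author_join(user_id):
    """Read-time join: scan posts, then look up the author in users."""
    return [
        {"title": p["title"], "author": users[p["user_id"]]["name"]}
        for p in posts
        if p["user_id"] == user_id
    ]


# Denormalized layout: one row per user, addressed by row key, with the
# author name copied into every column value at write time.
posts_by_user = {}


def write_post(user_id, post_id, title):
    """Write path does the extra work so the read path needs no join."""
    row = posts_by_user.setdefault(user_id, {})
    row[post_id] = {"title": title, "author": users[user_id]["name"]}


for p in posts:
    write_post(p["user_id"], p["post_id"], p["title"])


def posts_with_author_denormalized(user_id):
    """Read path: a single lookup by row key."""
    return list(posts_by_user.get(user_id, {}).values())
```

Both read functions return the same data; the difference is that the second one only ever touches one row, at the cost of duplicating the author name on every write.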

However, Cassandra 0.6 (beta officially out tomorrow, but you can build from the 0.6 branch yourself if you're impatient) supports Hadoop MapReduce for analytics, which actually sounds like a good fit for you.

Cassandra makes it painless to add new nodes, even to an initial cluster of one.

That said, at a few hundred writes/minute you're going to be fine on MySQL for a long, long time. Cassandra is much better at being a key/value store (even better, key/columnfamily) but MySQL is much better at being a relational database. :)

There is no Django support for Cassandra (or any other NoSQL database) yet. They are talking about doing something for the next version after 1.2, but based on talking to Django devs at PyCon, nobody is really sure what that will look like yet.

jbellis
Thx for the answer! Couple of points - when you say the emphasis is on denormalization, that would basically imply that any "joins" that need to be done happen at the app level, but cassandra in effect distributes the query (assuming you use Random Partitioning)? Secondly - I guess I'm at a few hundred writes right now, but would much rather switch to a K-V store at this point than have to do it with a few 100k writes :) And lastly - even assuming that Django-NOSQL support still doesn't exist, is there anything that prevents real time querying of the Cassandra db through a REST API?
viksit
Cassandra routing is based on row key, so any query against a single row only has to hit one machine and is quite performant. A REST client API is a poor fit for Cassandra, since Cassandra allows binary data, but more broadly, there's nothing stopping you from using the normal Python driver from Django manually.
jbellis
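For anyone wondering what "routing is based on row key" looks like, here is a simplified sketch loosely modeled on hash-based (random) partitioning: hash the row key to a token, then pick the first node on the ring whose token is at or past it. The `Ring` class, the node names, and the token assignments are assumptions for illustration, and the sketch ignores replication entirely; it is not Cassandra's actual implementation.

```python
import bisect
import hashlib

# Assumed token space for this sketch.
TOKEN_SPACE = 2 ** 127


def token_for(row_key: bytes) -> int:
    """Map a row key to a token by hashing it with MD5."""
    digest = hashlib.md5(row_key).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


class Ring:
    """Hypothetical token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens: dict mapping node name -> token
        self._entries = sorted((tok, name) for name, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def node_for(self, row_key: bytes) -> str:
        """Return the single node responsible for this row key."""
        idx = bisect.bisect_left(self._tokens, token_for(row_key))
        # Wrap around to the first node if the token is past the last one.
        return self._entries[idx % len(self._entries)][1]


# Three made-up nodes spaced evenly around the ring.
ring = Ring({
    "node-a": TOKEN_SPACE // 3,
    "node-b": 2 * TOKEN_SPACE // 3,
    "node-c": TOKEN_SPACE - 1,
})

# The same row key always routes to the same node, so a single-row
# query only needs to contact that one machine.
owner = ring.node_for(b"user:42")
```

The point is only that the owner is computed from the key itself, so the client (or coordinator) never has to ask every node where a row lives.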