I'm working on a project that is considering Cassandra as its database. Even if we start with MySQL, we would like to migrate to Cassandra eventually for its scalability. I know that big companies like Facebook, Digg, and recently Twitter are using Cassandra, but I don't believe any of those sites run on Rails. My question is whether it's feasible to use Cassandra with Ruby on Rails. Points to consider:

  1. We heavily rely on the Authlogic gem. Would switching to Cassandra affect how it works?
  2. Are there any mature Ruby clients for Cassandra? Looking on GitHub, it seems that fauna's client is the most mature. Has anyone had production experience with it?

Appreciate any tips.

+8  A: 

Twitter is running Rails on most of their front end. Fauna's client is actually built and released by Twitter, so you can be pretty certain that it's up to date and stable on large workloads. The commit history shows frequent improvements being pushed to it, which is great.

Most likely Authlogic would need to be customized to work properly with Cassandra. In particular, it appears to provide certain methods based on named_scope and relational data.

It does appear that someone has built a plugin for DataMapper support in Authlogic: http://twitter.com/collintmiller/statuses/2064046718. You may be able to use that as a starting point for making it compatible with Cassandra.

Good luck!

Gdeglin
Thanks. The clarification about Fauna is very helpful, and I'll definitely look into the DataMapper plugin.
funkymunky
Another option worth considering: have your Rails app use BOTH MySQL and Cassandra. This way your users table (amongst others) could stay on MySQL with Authlogic while your high-volume tables could go to Cassandra. I haven't actually tried this yet, but it sounds possible from what I've seen.
Brian Armstrong
+3  A: 

I don't think starting with MySQL and then moving to Cassandra is a good idea.

Cassandra is a NoSQL solution, while MySQL is a "classic" SQL-driven database.

This means that your models would be different.

If you start with MySQL, you will have to rely on ActiveRecord for creating your models. If you then change to Cassandra, you will have to change all your models to a NoSQL-compatible middleware (such as BigRecord). This not only means changing your models, but also the controllers that use them (since their interface would be different).
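As a rough illustration of why the model interface tends to differ, here is a minimal sketch using in-memory stand-ins. The class names `RelationalUsers` and `KeyedUsers` and all their methods are hypothetical; they are not the ActiveRecord, BigRecord, or Cassandra client APIs. The point is only that a relational table can be filtered by any column, while a key-based store needs the application to maintain its own lookup paths.

```ruby
# In-memory stand-ins only; neither class is a real ActiveRecord or Cassandra API.

# Relational style: rows can be filtered by any column.
class RelationalUsers
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
  end

  # Roughly what a WHERE clause gives you: query by any attribute.
  def where(conditions)
    @rows.select { |row| conditions.all? { |col, val| row[col] == val } }
  end
end

# Key-based style: rows are fetched by row key, so any other access path
# (here: lookup by email) needs an index the application maintains itself.
class KeyedUsers
  def initialize
    @rows = {}
    @email_index = {}
  end

  def insert(key, row)
    @rows[key] = row
    @email_index[row[:email]] = key
  end

  def get(key)
    @rows[key]
  end

  def find_by_email(email)
    key = @email_index[email]
    key && @rows[key]
  end
end

sql = RelationalUsers.new
sql.insert({ id: 1, login: "alice", email: "alice@example.com" })

kv = KeyedUsers.new
kv.insert("1", { login: "alice", email: "alice@example.com" })
```

With the first store, `sql.where({ login: "alice" })` works without any preparation; with the second, every query other than `get` has to be planned for at write time.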

That said, Cassandra and the like are meant for very demanding applications, like Twitter.

Most other web applications out there are orders of magnitude less intense. Are you sure you would still need Cassandra?

PostgreSQL with a well-designed schema is good enough 98% of the time.

egarcia
Totally agreed. NoSQL is a cool and exciting technology if you need to scale something big cheaply. But you need something big to scale first.
Jeduan Cornejo
+1  A: 

There is also http://github.com/NZKoz/cassandra_object, which IIANM builds on top of the fauna client. "Cassandra Object provides a nice API for working with Cassandra. CassandraObjects are mostly duck-type compatible with ActiveRecord objects so most of your controller code should work ok... Use this in production only if you're looking to help out with the development, there are a bunch of rough edges right now."

jbellis
+1  A: 

I'm researching Cassandra, MongoDB, and CouchDB right now.

One way to tell which has the most developer support is to check the number of watchers on the highest-rated GitHub project for each, at least as a rough estimate.

Right now it's

852 - MongoDB http://github.com/jnunemaker/mongomapper

544 - CouchDB http://github.com/jchris/couchrest

178 - Cassandra http://github.com/fauna/cassandra

That said, a bunch of high-profile sites (Twitter, Digg, Reddit, etc.) recently announced that they're moving to Cassandra, which is a big vote of confidence for it.

Mongo seems to have the most and best documentation so far. Its auto-sharding is still in alpha though, so I think how well it scales remains to be seen.

I'm just starting to learn about all this stuff, so if others have insight please share.

Brian Armstrong
This is a bit skewed, as Mongo has a lot of use cases, while something like Cassandra is only going to be used by people who have specific needs, in this case high performance etc.
Bitterzoet
+4  A: 

"If you then change to Cassandra, you will have to change all your models to a NoSQL"

This isn't true at all. If you have programmed in such a way that your MySQL db does loads of joins, then yes, you may have a problem. We avoided joins as much as we could from the beginning, when we started down the MySQL route. Then when we started migrating to Cassandra it was fairly easy: we did so with 1 model only at first, then say 4 models in one go, etc. Works well. In fact, when you read the interview with Twitter you'll notice they ran MySQL and Cassandra in parallel for the same model for a while: http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king.
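Running both stores in parallel for one model could look roughly like the sketch below. Everything here is an assumption made for illustration: `MemoryStore` is an in-memory stand-in for either backend, and `DualWriteRepository` with its `save`, `find`, and `cut_over!` methods is a hypothetical wrapper, not something from the interview or from any gem.

```ruby
# Hypothetical stand-in for either backend; a real app would wrap
# ActiveRecord and a Cassandra client behind the same two methods.
class MemoryStore
  def initialize
    @data = {}
  end

  def write(key, attrs)
    @data[key] = attrs.dup
  end

  def read(key)
    @data[key]
  end
end

# Writes go to both stores; reads come from whichever one is
# currently trusted, so a single model can be switched over
# without touching its callers.
class DualWriteRepository
  def initialize(old_store, new_store)
    @old_store = old_store
    @new_store = new_store
    @read_from_new = false
  end

  def save(key, attrs)
    @old_store.write(key, attrs)
    @new_store.write(key, attrs)
  end

  def find(key)
    (@read_from_new ? @new_store : @old_store).read(key)
  end

  # Flip reads to the new store once it has proven itself.
  def cut_over!
    @read_from_new = true
  end
end

mysql_like = MemoryStore.new
cassandra_like = MemoryStore.new
statuses = DualWriteRepository.new(mysql_like, cassandra_like)

statuses.save("42", { text: "hello" })
before = statuses.find("42")   # served by the old store
statuses.cut_over!
after = statuses.find("42")    # served by the new store
```

Because callers only see `save` and `find`, models can be moved one at a time, which matches the 1-then-4 approach described above. The sketch ignores backfilling old rows and handling a failed write to one of the two stores.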

As for Authlogic, you can keep that part in MySQL for as long as you like; just keep it loosely coupled with your Cassandra data.
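One way to read "loosely coupled" is that the only thing shared between the two stores is the user's id. The sketch below assumes that reading; `TimelineService`, its methods, and the `"user:<id>"` key format are all hypothetical, and plain hashes stand in for both the MySQL users table and the Cassandra data.

```ruby
class TimelineService
  def initialize(users)
    # Stand-in for the MySQL users table that Authlogic would manage.
    @users = users
    # Stand-in for Cassandra rows, keyed by a string derived from the user id.
    @timelines = Hash.new { |hash, key| hash[key] = [] }
  end

  def post_status(user_id, text)
    # The relational side is consulted only to confirm the user exists.
    raise ArgumentError, "unknown user" unless @users.key?(user_id)
    @timelines[row_key(user_id)] << text
  end

  def timeline_for(user_id)
    # fetch avoids creating an empty row for users who never posted.
    @timelines.fetch(row_key(user_id), [])
  end

  private

  # The user id is the only value that crosses the store boundary.
  def row_key(user_id)
    "user:#{user_id}"
  end
end

service = TimelineService.new({ 7 => { login: "alice" } })
service.post_status(7, "first post")
service.post_status(7, "second post")
```

Nothing on the timeline side depends on how authentication is stored, so the users table can stay where it is while the high-volume data moves.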

Jim Soho
Thanks for the link, it's very interesting. We're sticking with MySQL for now because we know 100% it works with our app. It's good to know that the switch to Cassandra is possible in the future though.
funkymunky