I am evaluating what might be the best migration option.

Currently, I am on sharded MySQL (horizontal partitioning), with most of my data stored in JSON blobs. I do not have any complex SQL queries (I already migrated away from them when I partitioned my db).

Right now, it seems like both MongoDB and Cassandra would be likely options. My situation:

  • lots of reads in every query, less regular writes
  • not worried about "massive" scalability
  • more concerned about simple setup, maintenance and code
  • minimize hardware/server cost
+4  A: 

I haven't used Cassandra, but I have used MongoDB and think it's awesome.

If you're after simple setup, this is it. You simply untar MongoDB and run the mongod daemon, and that's it: it's running.

Obviously that's only a starter, but to get you started it's easy.

dalton
+1 for mongo. I put together a simple app that used mongodb in about 20 minutes. It couldn't have been easier.
Chris Lively
+2  A: 

I saw a presentation on mongodb yesterday. I can definitely say that setup was "simple", as simple as unpacking it and firing it up. Done.

I believe that both MongoDB and Cassandra will run on virtually any regular Linux hardware, so you should not find too much of a barrier in that area.

I think in this case, at the end of the day, it will come down to which one you personally feel more comfortable with and which has a toolset that you prefer. As far as the presentation on MongoDB goes, the presenter indicated that the toolset for MongoDB was pretty light and that there weren't many (they said hardly any, really) tools similar to what's available for MySQL. This was of course their experience, so YMMV. One thing that I did like about MongoDB was that there seemed to be lots of language support for it (Python and .NET being the two that I primarily use).

The list of sites using MongoDB is pretty impressive, and I know that Twitter just switched to using Cassandra.

GrayWizardx
+18  A: 

Lots of reads in every query, less regular writes

A read-heavy workload is better suited to MongoDB. One of Cassandra's biggest strengths is its linear scaling of writes. It still does reads relatively efficiently, but MongoDB wins in this regard with flexible querying etc.

Not worried about "massive" scalability

Sounds like MongoDB's sharding (presently in alpha, be warned) will cover you enough.

More concerned about simple setup, maintenance and code

MongoDB has a simpler setup, though both are pretty simple (Cassandra just requires a small amount of file-based config). If you're presently using JSON blobs, Mongo is an insanely good match for your use case, given that it uses BSON to store the data. You'll be able to have richer and more queryable data than you would in your present database.

Since MongoDB can take care of your indexes etc. (while you'd probably have to roll your own secondary index generation for Cassandra), and you're using JSON blobs currently, the reduction in maintenance and code would be the most significant win for Mongo.
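To make "roll your own secondary index" concrete, here is a minimal sketch in plain Python. It is not Cassandra code: the class `KeyValueStoreWithIndex` and its methods are hypothetical names, and an in-memory dict stands in for the key/value store. It only illustrates the bookkeeping your application would have to own on every write.

```python
from collections import defaultdict


class KeyValueStoreWithIndex:
    """Toy in-memory key/value store with one hand-maintained secondary index.

    Hypothetical illustration only; a dict stands in for the real store.
    """

    def __init__(self, indexed_field):
        self.indexed_field = indexed_field
        self.rows = {}                 # primary key -> JSON-like dict
        self.index = defaultdict(set)  # field value -> set of primary keys

    def put(self, key, row):
        # If the row is being overwritten, drop its stale index entry first.
        old = self.rows.get(key)
        if old is not None and self.indexed_field in old:
            self.index[old[self.indexed_field]].discard(key)
        self.rows[key] = row
        if self.indexed_field in row:
            self.index[row[self.indexed_field]].add(key)

    def find_by_field(self, value):
        # The lookup goes through the index instead of scanning every row.
        return [self.rows[k] for k in sorted(self.index.get(value, ()))]


store = KeyValueStoreWithIndex("LastName")
store.put("u1", {"FirstName": "John", "LastName": "Smith"})
store.put("u2", {"FirstName": "Jane", "LastName": "Doe"})
store.put("u1", {"FirstName": "John", "LastName": "Jones"})  # update moves the index entry

smiths = store.find_by_field("Smith")
joneses = store.find_by_field("Jones")
print(smiths, joneses)
```

Every write path has to keep `rows` and `index` in step, which is the extra maintenance and code that a database with built-in indexes takes off your hands.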

Minimize hardware/server cost

Both are efficient for what they achieve, but I believe you'll get more bang for your buck with MongoDB, given your read-heavy workload.

Edit:

My experience is just based off doing toy apps in both and I avidly follow both platforms (as imho they're the best of their respective domains).

Michael
What do you mean by "respective domains" - would you consider them as separate types? Thanks for the great replies!
ming yeow
Michael
while Cassandra is lower level but allows for uber scaling (see Twitter/Digg/Facebook), but you're going to have to be deliberate in how you lay your data out, build secondary indexes etc, since no flexible querying is allowed.
Michael
With Cassandra you will get similar read performance if the setup does not actually spread data across the nodes in the cluster: a 3-node cluster with a replication factor of 3 will give you similar performance, since every node holds all the data. So performance can't be compared with MongoDB apples to apples.
mamu
+5  A: 

I've used MongoDB extensively (for the past 6 months), building a hierarchical data management system, and I can vouch for both the ease of setup (install it, run it, use it!) and the speed. As long as you think about indexes carefully, it can absolutely scream along, speed-wise.

I gather that Cassandra, due to its use with large-scale projects like Twitter, has better scaling functionality, although the MongoDB team is working on parity there. I should point out that I've not used Cassandra beyond the trial-run stage, so I can't speak for the detail.

The real swinger for me, when we were assessing NoSQL databases, was the querying - Cassandra is basically just a giant key/value store, and querying is a bit fiddly (at least compared to MongoDB), so for performance you'd have to duplicate quite a lot of data as a sort of manual index. MongoDB, on the other hand, uses a "query by example" model.

For example, say you've got a Collection (MongoDB parlance for the equivalent of an RDBMS table) containing Users. MongoDB stores records as Documents, which are basically binary JSON objects, e.g.:

{
   FirstName: "John",
   LastName: "Smith",
   Email: "[email protected]",
   Groups: ["Admin", "User", "SuperUser"]
}

If you wanted to find all of the users called Smith who have Admin rights, you'd just create a new document (at the admin console using Javascript, or in production using the language of your choice):

{
   LastName: "Smith",
   Groups: "Admin"
}

...and then run the query. That's it. There are added operators for comparisons, RegEx filtering etc, but it's all pretty simple, and the Wiki-based documentation is pretty good.
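To show what "query by example" means for the documents above, here is a rough pure-Python sketch of the matching rule. It is not how MongoDB is implemented; the helper `matches` is a hypothetical name, and it covers only plain equality plus the "array field contains this value" case, with no operators, nested fields or regexes.

```python
def matches(document, query):
    """Return True if `document` satisfies every field in `query`.

    Simplified sketch: equality and "array contains value" only.
    """
    for field, wanted in query.items():
        if field not in document:
            return False
        actual = document[field]
        if isinstance(actual, list) and not isinstance(wanted, list):
            # A scalar in the query matches an array field that contains it.
            if wanted not in actual:
                return False
        elif actual != wanted:
            return False
    return True


users = [
    {"FirstName": "John", "LastName": "Smith", "Groups": ["Admin", "User", "SuperUser"]},
    {"FirstName": "Anna", "LastName": "Smith", "Groups": ["User"]},
    {"FirstName": "Bob", "LastName": "Jones", "Groups": ["Admin"]},
]

query = {"LastName": "Smith", "Groups": "Admin"}
found = [u["FirstName"] for u in users if matches(u, query)]
print(found)  # ['John']
```

With a real server you would not write the matcher yourself. Using the Python driver (pymongo), for instance, the same query document would be passed to `collection.find(...)`, where `collection` is whatever collection holds your users.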

Richard K.