Hello,

I'm in the process of choosing a database for my application. I have used MySQL for a long time, but performance and scalability are important for my current application, and I know MySQL has its limitations. I have been hearing a lot about key-value stores, column-based DBs, document-based DBs, and others. I have looked into:

  • Cassandra
  • MongoDB
  • Redis
  • CouchDB

They all seem (or claim) to be faster than relational DBs such as MySQL.
I'm using Ruby on Rails and there are clients for all the above so it shouldn't be a problem.

My data model is simple for the most part: it is centered on a user object (with a rich profile and preferences) related to different items such as photos, videos, posts, etc., and each of these has one or more tags.

Because these databases are new, there don't seem to be many resources for them online. They are also structurally different from one another, so it will not be trivial to switch from one to another later.

I would appreciate your input on which DB you think would best suit my application and give it good performance and scale. Thanks,

Tam

A: 

The primary benefit of something like a document database, at least for your app, is that you can treat the entire User blob of info as a single document. You don't have to worry about adding tables for properties, new features, or whatever; rather, you can keep the bulk of it in the user document and update it dynamically.

For read-often, write-rarely workloads, this works a treat.

Now, you don't need a "document database" to do something like this. MySQL et al. will work just fine with a primary key and a CLOB (text) / BLOB field to hold the document.
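To make the "primary key plus text field" idea concrete, here is a minimal sketch. It assumes the document is serialized as JSON, and it uses Python's built-in SQLite in place of MySQL so it runs without a server. The table name `users` and the helpers `save_user` and `load_user` are made up for illustration.

```python
import json
import sqlite3

# SQLite stands in for MySQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")

def save_user(conn, user_id, profile):
    """Serialize the whole profile to JSON and store it under its primary key."""
    conn.execute(
        "INSERT OR REPLACE INTO users (id, doc) VALUES (?, ?)",
        (user_id, json.dumps(profile)),
    )

def load_user(conn, user_id):
    """Fetch one document by primary key; return None if it does not exist."""
    row = conn.execute(
        "SELECT doc FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

# Store a user, then add a new kind of attribute without any schema change.
save_user(conn, 1, {"name": "Tam", "preferences": {"theme": "dark"}})
profile = load_user(conn, 1)
profile["photos"] = [{"title": "Beach", "tags": ["summer"]}]
save_user(conn, 1, profile)
```

The trade-off is that the database cannot see inside the document, so looking users up by anything other than the primary key needs extra indexed columns or a scan.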

Where something like CouchDB (the one that I'm most familiar with in this space) can help is that it has well-supported replication, and it's straightforward to create views on specific attributes of the documents (for example, you want all "premier" members, or whatever).

Plus, since CouchDB speaks HTTP, it works well with the modern caches and such that are available, which can help you in scaling, especially in, again, read-heavy operations.
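As a rough illustration of what a view on an attribute does, here is a sketch in plain Python. It does not talk to CouchDB: real CouchDB views are normally written as JavaScript map functions stored in a design document. The field names (`type`, `membership`) and the helper `build_view` are assumptions made for the example.

```python
def premier_members_map(doc):
    """Map function: yield a (key, value) row for each premier member."""
    if doc.get("type") == "user" and doc.get("membership") == "premier":
        yield doc["name"], doc["_id"]

def build_view(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows)

docs = [
    {"_id": "u1", "type": "user", "name": "Tam", "membership": "premier"},
    {"_id": "u2", "type": "user", "name": "Bo", "membership": "basic"},
    {"_id": "u3", "type": "user", "name": "Ana", "membership": "premier"},
    {"_id": "p1", "type": "photo", "owner": "u1", "tags": ["summer"]},
]

premier = build_view(docs, premier_members_map)
```

The point is that the view is derived from the documents themselves, so adding a new view does not require changing how the documents are stored.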

A lot of this is more about overall architecture than actual tools, so make sure you consider that first.

Will Hartung
A: 

There is also Tokyo Cabinet, which is used by some large sites.

I have not yet used one, but my understanding is that when a site like Twitter needs to turn large numbers of messages around very quickly, the overhead of the RDBMS is just too great and starts to slow the response times down significantly.

What you would need to do is look at the advantages you get from an RDBMS and weigh them against its speed, then do the same in reverse for a NoSQL-type database.

RDBMSs give you a standard; they give you security, integrity, and a general-purpose language based on sets to make data manipulation easier. However, if you do not need all or any of that structure, you are losing out on speed.

Prior to SQL there were CODASYL and network databases. SQL took over because of portability, transferability of skills, etc. But I think the mobile, wired world is changing this, and it would be worth investigating.

PurplePilot
+3  A: 

Step 1) Create your design using whatever technology you are strongest with.

Step 2) Release your social network, begin researching non-relational databases, and master whichever you feel most comfortable with.

Step 3) Refactor your data tier so you could potentially replace MySQL quickly and easily with your newly learned DB technology.

Step 4) Wait for your website to become so big that the need to replace MySQL comes around and begin to plug the holes.

I know this seems kind of cheeky, but really my point is: just release your software, and start to worry about scale etc. when it actually becomes a concern.
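One way to read Step 3 is to hide the database behind a small interface, so application code never depends on MySQL directly. The sketch below is only an illustration under that assumption. `UserStore`, `InMemoryUserStore`, and `rename_user` are hypothetical names, and a real app would add a MySQL-backed class (and later a NoSQL-backed one) implementing the same two methods.

```python
from abc import ABC, abstractmethod

class UserStore(ABC):
    """The only persistence interface the application code is allowed to use."""

    @abstractmethod
    def get(self, user_id):
        """Return the user's profile dict, or None if the user is unknown."""

    @abstractmethod
    def put(self, user_id, profile):
        """Create or overwrite the user's profile."""

class InMemoryUserStore(UserStore):
    """Stand-in backend; a MySQL or NoSQL class would implement the same methods."""

    def __init__(self):
        self._docs = {}

    def get(self, user_id):
        doc = self._docs.get(user_id)
        return dict(doc) if doc is not None else None

    def put(self, user_id, profile):
        self._docs[user_id] = dict(profile)

def rename_user(store, user_id, new_name):
    """Application logic written against UserStore, not a specific database."""
    profile = store.get(user_id) or {}
    profile["name"] = new_name
    store.put(user_id, profile)

store = InMemoryUserStore()
store.put(1, {"name": "Tam"})
rename_user(store, 1, "Tamer")
```

Swapping databases then means writing one new class rather than touching every place the app reads or writes a user.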

steve
+1 especially for 1) and 2). Shipping is a feature more important than scalability.
Tadeusz A. Kadłubowski
Taking this approach too far is why software sucks.
Nick Bastin
And the exact opposite is why software projects fail, go over budget or run out of time. There is no perfect solution.
steve