Should I use Cassandra for a project with 100,000 users? MySQL 5 has full-text search and partitioned tables. I'm starting to build a question-and-answer system like Stack Overflow with CodeIgniter, migrating from vBulletin to the new system. The old vBulletin forum has around 100,000 users and around 80,000 posts in total. Over the next 3 or 4 years, the number of users and posts will keep growing. So, should I use Cassandra instead of MySQL 5?

If I use Cassandra, I need to move from the gridserver at MediaTemple to a DV server at MediaTemple. Cassandra is not provided by the hosting system, so I must use a VPS or DV server.

If I use MySQL 5, hosting is not a problem, but what about speed and search? By the way, what database does Stack Overflow use?

+5  A: 

From the information you provided, I would suggest sticking with MySQL.

Just as a side note, Facebook started out on MySQL and moved its inbox search to Cassandra only once it was storing over 7 terabytes of inbox data for over 100 million users.

Wikipedia also handles hundreds of gigabytes of text data in MySQL.

Daniel Vassallo
Thanks, that's great information for me.
saturngod
+6  A: 

You say 100,000 users - but how many concurrent users?

Cassandra is not built in hosting system

Using a hosted service on a single server suggests a very small-scale operation, and you're obviously limited by your budget. There's certainly no advantage to running Cassandra on a single server node.

In mysql 5 have full text search

Which is not a very scalable solution - you should definitely think about using a normalized search (which I believe you'd have to do if you were migrating to Cassandra anyway).
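To make "normalized search" more concrete: one common reading of the term is a keyword table plus a post-to-keyword link table, so that a search becomes an indexed join rather than a scan over post text. The sketch below is only an illustration under that assumption. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table names (posts, keywords, post_keywords) and helper functions are hypothetical.

```python
import re
import sqlite3

def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(conn, posts):
    """Create the three tables and fill them from (id, body) pairs."""
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE keywords (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE);
        CREATE TABLE post_keywords (
            keyword_id INTEGER NOT NULL REFERENCES keywords(id),
            post_id INTEGER NOT NULL REFERENCES posts(id),
            PRIMARY KEY (keyword_id, post_id)
        );
    """)
    for post_id, body in posts:
        conn.execute("INSERT INTO posts (id, body) VALUES (?, ?)", (post_id, body))
        for word in tokenize(body):
            # Each distinct word is stored once; posts only link to it.
            conn.execute("INSERT OR IGNORE INTO keywords (word) VALUES (?)", (word,))
            conn.execute(
                "INSERT INTO post_keywords (keyword_id, post_id) "
                "SELECT id, ? FROM keywords WHERE word = ?",
                (post_id, word),
            )

def search(conn, query):
    """Return ids of posts containing every word of the query, ascending."""
    words = sorted(tokenize(query))
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    rows = conn.execute(
        f"""
        SELECT pk.post_id
        FROM post_keywords AS pk
        JOIN keywords AS k ON k.id = pk.keyword_id
        WHERE k.word IN ({placeholders})
        GROUP BY pk.post_id
        HAVING COUNT(DISTINCT k.word) = ?
        ORDER BY pk.post_id
        """,
        (*words, len(words)),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
build_index(conn, [
    (1, "How do I scale MySQL with replication?"),
    (2, "Is Cassandra worth it on a single server?"),
    (3, "MySQL full-text search versus Sphinx"),
])
print(search(conn, "mysql"))         # [1, 3]
print(search(conn, "mysql search"))  # [3]
```

The same three-table layout carries over to MySQL more or less unchanged; a real system would also want stop-word handling and some form of ranking, which this sketch leaves out.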

Given that you can comfortably scale the MySQL solution to multiple databases using replication before you even think about a fully clustered solution, and you obviously don't have the budget to do your own hosting, migrating to Cassandra seems like massive overkill.

symcbean
Thanks. I will switch to DV in the future. For now, I'm running on the MediaTemple gridserver. What is a normalized search?
saturngod
+2  A: 

I would NOT recommend using Cassandra in your case, for the following reasons:

  1. Cassandra requires a good understanding of the application you're building. It will be much harder to make changes and to run complex queries against data stored in Cassandra. SQL is more flexible and easier to maintain. Cassandra is good when you need to store huge amounts of data and when you know exactly how the data stored in it will be accessed and sorted.

  2. MySQL works fine for millions of rows if proper indexes are built.

  3. If you hit bottlenecks with MySQL in the future, you can look at what exactly the problems are and address them with Cassandra. In other words, you should be able to combine both approaches, SQL and NoSQL, in the same project.
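As a rough illustration of point 2, the sketch below checks whether a lookup uses an index. It is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (where you would run EXPLAIN on the query instead), and the posts table, its columns, and the index name are all hypothetical. The 80,000 rows simply mirror the post count mentioned in the question.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    ((i % 1000, f"post {i}") for i in range(80_000)),
)

lookup = "SELECT id, title FROM posts WHERE user_id = ?"

# Without an index on user_id, the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# With the index, the planner can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_posts_user_id ON posts (user_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should report a full table scan and the second should mention idx_posts_user_id.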

As for MySQL's full-text index, I'd say it's useless: it performs too poorly to be used in high-load projects. Look at sphinxsearch.com, a great full-text search implementation made for SQL databases.

But if you expect your system to grow fast and serve millions of users, you should consider Cassandra from the beginning.

Andriy Bohdan