views: 130
answers: 6

How do I know when a project is just too big for MySQL and I should use something with a better reputation for scalability?

Is there a maximum database size for MySQL before performance starts to degrade? What factors make MySQL not a viable option compared to a commercial DBMS like Oracle or SQL Server?

+2  A: 

Google uses MySQL. Is your project bigger than Google?

Smart-aleck comments aside, MySQL is a professional-level database system. If your application puts a strain on MySQL, I'd bet it will do the same to just about any other database.

DexterW
Interestingly enough, there is no single "Google" system that is "bigger than Google"; Google uses a lot of technologies for many different things. By your smart-aleck logic, I'm sure Google is "just using MS Access" too (somewhere in a company the size of Google, I'm sure I'd find an Access database).
TomTom
+2  A: 

If you are looking for a couple of examples:

Daniel Vassallo
Hardly a plus for MySQL - I handle hundreds of gigabytes of financial data in SQL Server without breaking a sweat ;) Modern hardware is SO damn powerful.
TomTom
I've personally gotten thousands of queries per second through MySQL. Percona also has great commercial offerings, in addition to Oracle (which owns MySQL).
Adam Nelson
Dan: who was having the issues? Facebook or the Cassandra project?
Nitrodist
@Nitrodist: No, Facebook before they moved their inbox stuff to Cassandra.
Daniel Vassallo
+1  A: 

MySQL is a commercial DBMS; you simply have the option of paying for the kind of support and monitoring that Oracle or Microsoft offer, or of relying on community support and community-provided monitoring software.

Nitrodist
+1  A: 

Size is not the only thing you should look at; operational concerns are also critical:

  • Scenarios for backup and restore.
  • Maintenance. Example: SQL Server Enterprise can rebuild an index WHILE THE OLD ONE IS STILL AVAILABLE - transparently. This means no downtime for an index rebuild (see the sketch after this list).
  • Availability (basically, you do not want to have to restore a 5,000 GB database if a server dies) - mirroring is preferred; replication "sucks" (technically).
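For illustration, here is what that kind of online rebuild looks like in T-SQL; the index and table names are made up, and ONLINE = ON requires SQL Server Enterprise edition:

    -- Rebuild the index while the old copy stays readable and writable
    -- (hypothetical index/table names; Enterprise edition only)
    ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
    REBUILD WITH (ONLINE = ON);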

Whatever you go for, be careful with Oracle RAC (their cluster) - it is known to be "problematic" (to put it mildly). SQL Server is known to be a lot cheaper and to scale a lot worse (no RAC equivalent), but it basically works without making admins want to commit suicide every hour (RAC seems to do that). Scaling "a lot worse" is still good enough for TerraServer (http://msdn.microsoft.com/en-us/library/aa226316(SQL.70).aspx).

There were some questions here recently from people having problems rebuilding indexes on a 10 GB database or thereabouts.

So much for my 2 cents. I am sure some MySQL specialists will jump in on issues there.

TomTom
+2  A: 

I work for a very large Internet company. MySQL can scale very, very large with very good performance, with a couple of caveats.

One problem you might run into is that an index greater than 4 gigabytes can't go into memory. I once spent a lot of time trying to improve MySQL's full-text performance by fiddling with some index parameters, but you can't get around the fundamental problem: if your query has to hit disk for an index, it gets slow.
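If you want to see whether you are anywhere near that kind of limit, a rough sketch is to look at index sizes in information_schema and at the MyISAM key cache setting (the schema and table names below are hypothetical, and the key_buffer_size value is just an example; as the comment below notes, the cap is configurable):

    -- How big are the indexes on a given table?
    SELECT table_name,
           ROUND(index_length / 1024 / 1024) AS index_mb
    FROM   information_schema.TABLES
    WHERE  table_schema = 'mydb'          -- hypothetical schema
      AND  table_name   = 'articles';     -- hypothetical table

    -- MyISAM key cache size, the setting usually associated with that 4 GB figure
    SHOW VARIABLES LIKE 'key_buffer_size';
    SET GLOBAL key_buffer_size = 2147483648;  -- 2 GB, example value; needs SUPER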

There are helper applications that can solve parts of the problem. For full-text search, there is Sphinx: http://www.sphinxsearch.com/
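For context, this is the native MySQL full-text feature (MyISAM-only at the time) that Sphinx is typically brought in to replace; the table and column names are invented for illustration:

    -- Built-in full-text index and query
    CREATE FULLTEXT INDEX ft_articles ON articles (title, body);

    SELECT id, title
    FROM   articles
    WHERE  MATCH(title, body) AGAINST ('mysql scalability' IN NATURAL LANGUAGE MODE);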

Jeremy Zawodny, who now works at Craigslist, has a blog on which he occasionally discusses the performance of large databases: http://blog.zawodny.com/

In summary, your project probably isn't too big for MySQL. It may be too big for some of the ways that you've used MySQL before, and you may need to adapt them.

David M
An index larger than 4 GB CAN fit into memory. You may be referring to an ancient (and in any case configurable) limitation of MyISAM. Full-text indexes are, however, pretty much useless in MySQL, because they're only supported on MyISAM and don't have very good features.
MarkR
A: 

Mostly it is table size.

I am assuming here that you will use the Oracle InnoDB plugin for MySQL as your engine. If you do not, that probably means you're using a commercial engine such as InfiniDB, Infobright, or Tokutek, in which case your questions should be sent to them.

InnoDB gets a bit nasty with very large tables. With very large instances you are advised to partition your tables if at all possible. Essentially, if your (frequently used) indexes don't all fit into RAM, inserts will be very slow, as they need to touch a lot of pages that are not in RAM. This cannot be worked around.

You can use the MySQL 5.1 partitioning feature if it does what you want, or partition your tables at the application level if it does not. If you can get your tables' indexes to fit in RAM, and only load one table at a time, then you're onto a winner.
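As a sketch of the 5.1 built-in approach (table, columns, and partition boundaries here are invented for illustration):

    -- Range partitioning by year; the partitioning column must appear in
    -- every unique key, hence the composite primary key
    CREATE TABLE page_views (
        id        BIGINT NOT NULL AUTO_INCREMENT,
        viewed_at DATETIME NOT NULL,
        url       VARCHAR(255) NOT NULL,
        PRIMARY KEY (id, viewed_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (YEAR(viewed_at)) (
        PARTITION p2008 VALUES LESS THAN (2009),
        PARTITION p2009 VALUES LESS THAN (2010),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    );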

You can use the plugin's compression to make your RAM go a bit further (as the pages are compressed in RAM as well as on disk), but it cannot beat the fundamental limitation.
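A minimal sketch of a compressed table with the InnoDB plugin (the names are made up, and it assumes innodb_file_per_table=1 and innodb_file_format=Barracuda are enabled):

    -- Pages are stored compressed both in the buffer pool and on disk
    CREATE TABLE events_compressed (
        id      BIGINT NOT NULL PRIMARY KEY,
        payload TEXT
    ) ENGINE=InnoDB
      ROW_FORMAT=COMPRESSED
      KEY_BLOCK_SIZE=8;   -- 8 KB compressed page size; tune per workload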

If your table's indexes don't all fit in RAM (or at least MOSTLY fit - if you have a few indexes that are NULL in 99.99% of cases, you might get away without those), insert speed will suck.

Database size is not a major issue, provided your tables individually fit in RAM while you're doing bulk loading (and, of course, you only load one at a time).
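A quick way to sanity-check that, assuming an InnoDB setup and a hypothetical schema name, is to compare per-table data + index sizes against the buffer pool:

    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS total_gb
    FROM   information_schema.TABLES
    WHERE  table_schema = 'mydb'    -- hypothetical schema
    ORDER  BY (data_length + index_length) DESC;

    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';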

These limitations really happen with most row-based databases. If you need more, consider a column database.

Infobright and InfiniDB both use a MySQL-based core and are column-based engines that can handle very large tables.

Tokutek is quite interesting too - you may want to contact them for an evaluation.

When you evaluate an engine's suitability, be sure to load it with very large data on production-grade hardware. There's no point in testing it with, say, a 10 GB database; that won't prove anything.
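For the bulk-loading part of such a test, here is a rough sketch of the usual InnoDB loading tricks, one table at a time (the file path and table name are hypothetical):

    -- Relax checks for the loading session only
    SET autocommit = 0;
    SET unique_checks = 0;
    SET foreign_key_checks = 0;

    LOAD DATA INFILE '/tmp/events.csv'    -- hypothetical file
    INTO TABLE events
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\n';

    COMMIT;
    SET unique_checks = 1;
    SET foreign_key_checks = 1;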

MarkR