Do really large (say, 100 million rows) databases regularly get broken up? I'm totally unfamiliar with scaling -anything-, but thought that maybe one database could act as a key to larger chunks of the actual data, becoming an index that would say "Oh yeah, that query needs the 20,000,000-25,000,000 block" and send the query to that block, skipping the rest. This is not something I need to implement, but thought I would ask here for resources pointing me in the right direction. Regards.

A: 

The biggest issue with large tables is how much data you are FETCHING with your queries. Good use of the LIMIT clause becomes paramount once a table is past a million rows or so, to the point that trying to select the whole thing will all but kill the server (I made that mistake exactly once).

Between indexes and LIMIT clauses, MySQL tables can be very performant even with very large datasets, as long as you aren't trying to access the whole dataset at once.
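
For example (the table, column, and index names here are made up, just to illustrate the idea):

    -- An index on the column you filter and sort by lets MySQL avoid scanning every row.
    CREATE INDEX idx_orders_created_at ON orders (created_at);

    -- Fetch one page at a time instead of the whole table.
    SELECT id, customer_id, total
    FROM orders
    ORDER BY created_at
    LIMIT 1000 OFFSET 0;   -- next page: OFFSET 1000, then 2000, and so on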

Matthew Scharley
+2  A: 

They certainly can be. I know, for instance, that large systems often have multiple datacenters that contain pieces of their database and load balance between them.

From my understanding, a key concept for dividing up a large database like you're talking about is database sharding. This article should help you get a better idea of how sharding divides up a database for easier access and to reduce load on individual servers.
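
As a very rough sketch (the table name, columns, and id ranges below are invented, not from any particular system), range-based sharding amounts to each server holding a slice of the key space, with the application routing each query to the right server:

    -- Shard 0 (server A) holds users with id 1-20,000,000
    -- Shard 1 (server B) holds users with id 20,000,001-40,000,000
    -- The application picks the shard before running the query, e.g.
    --   shard_number = floor((user_id - 1) / 20,000,000)
    -- and then sends the same query to only that server:
    SELECT name, email FROM users WHERE id = 23456789;   -- routed to shard 1 only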

I once thought I needed this type of technique for a database of about a million rows, but as I learned 3NF and proper database structure I repaired my horribly implemented database. Thanks for reminding me of this! I'll be interested to see what other answers you get.

Dean Putney
A: 

MySQL 5.1 has a table partitioning feature which can break a table into sections as you describe (at a basic level). However, MySQL has a whole range of performance-tuning features, and you need to be familiar with all of them in order to choose the best one for each installation.
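
As a rough illustration (the table and column names are made up), a range-partitioned table in MySQL 5.1 looks something like this:

    CREATE TABLE log_entries (
        id BIGINT NOT NULL,
        created_at DATETIME NOT NULL,
        message TEXT
    )
    PARTITION BY RANGE (id) (
        PARTITION p0 VALUES LESS THAN (20000000),
        PARTITION p1 VALUES LESS THAN (40000000),
        PARTITION p2 VALUES LESS THAN MAXVALUE
    );

    -- A query that filters on id only has to touch the matching partition(s):
    SELECT message FROM log_entries WHERE id BETWEEN 20000000 AND 25000000;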

too much php