I have a 200GB / 400M-row MySQL/InnoDB database - far beyond what's reasonable, as I found out.

One surprising problem is restoring backups. mysqldump generates huge SQL files, and they take about a week to import back into a fresh database (attempts at making it faster - bigger/smaller transactions, turning keys off during import, network compression, etc. - have all failed so far; a MyISAM import seems 2x faster, but then there would be no transactions).
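The kind of tweaks I mean, as a rough sketch (exact settings varied between attempts; dump.sql and mydb here are just placeholders):

# relax checks for the import session, replay the dump, commit once at the end
( echo "SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;"
  cat dump.sql
  echo "COMMIT;" ) | mysql --compress mydb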

What's worse - and I hope to get some help with this - a network connection that transfers >200GB over the course of a week has a non-trivial chance of breaking, and the SQL import process cannot be resumed in any trivial way.

What would be the best way of dealing with this? Right now, if I notice a broken connection, I manually try to figure out where it stopped by checking the highest primary key of the last imported table, and then use a Perl script that skips ahead in the dump to just past that INSERT and replays the rest; it basically does this:

perl -nle 'BEGIN{open F, "prelude.txt"; @a=<F>; print @a; close F;}; print if $x; $x++ if /INSERT.*last-table-name.*highest-primary-key/'

This really isn't the way to go, so what would be the best way?

+1  A: 

Does your MySQL box have enough hard drive space for all the data doubled? Local storage would be best here, but if it's not an option, you could also try some sort of NAS device utilizing iSCSI. It's still happening over the network, but in this case you get more throughput and reliability, because you're only relying on a NAS which has a pretty slim OS and almost never has to be rebooted.

Scott Anderson
It's all on the Amazon cloud. There have been no crashes so far, but I don't feel terribly happy that if there ever is one, it will take at least a week to restore from backups, plus random ad hoc Perl scripts if the import fails halfway through for any reason.
taw
A: 

Do you need everything in the database?

Can you push some of the information out to an archive database, and add something to your application that would allow people to view records in the archive?

Obviously this depends a lot on your application and setup, but it may be a solution. Your DB is probably only going to get bigger....

Matt
In the long term it will be necessary to fix it, but right now I'm just trying to keep it all running properly.
taw
A: 

You can't really use mysqldump to back up large databases - 200G is feasible, but for bigger ones it gets worse and worse.

Your best bet is to take a volume snapshot of the database directory and zip that up somehow - that's what we've generally done - or rsync it somewhere else.
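A very rough sketch of what that looks like with LVM, in case it helps (volume names, sizes and paths here are made up - adjust to your layout):

# snapshot the volume holding the MySQL datadir, mount it, copy it off, drop it
lvcreate --snapshot --size 10G --name mysql_snap /dev/vg0/mysql
mkdir -p /mnt/mysql_snap
mount /dev/vg0/mysql_snap /mnt/mysql_snap
rsync -az /mnt/mysql_snap/ backuphost:/backups/mysql/   # or tar/gzip it up instead
umount /mnt/mysql_snap
lvremove -f /dev/vg0/mysql_snap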

If your filesystem or block device does not support snapshots then you're basically in trouble. You can shut the db down to take a backup, but I don't imagine you want to do that.

To restore it, just do the opposite, then restart and wait (possibly for some time) for InnoDB recovery to fix things up.
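Restoring is roughly the reverse - again just a sketch, paths and init script names are illustrative:

# copy the backup back into the datadir and let InnoDB crash recovery run on startup
/etc/init.d/mysql stop
rsync -a backuphost:/backups/mysql/ /var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql
/etc/init.d/mysql start          # InnoDB recovery runs on startup; watch the error log
tail -f /var/log/mysql/error.log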

The Maatkit mk-parallel-dump and mk-parallel-restore tools are a bit better than mysqldump, speed-wise - but I'm not 100% convinced of their correctness.


Edit: re-reading the question, I think filesystem snapshot + rsync is probably the best way to go; you can do this without impacting the live system too much (you'll only need to transfer what changed since the last backup), and you can resume the rsync if the connection fails - it'll continue where it left off.
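The resumable transfer bit is just standard rsync behaviour - something along these lines (paths illustrative):

# --partial keeps partially-transferred files so a re-run picks up where it stopped;
# --inplace rewrites the destination file in place, so on later runs only changed
# blocks of the big InnoDB files go over the wire
rsync -avz --partial --inplace /mnt/mysql_snap/ backuphost:/backups/mysql/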

MarkR