I'm trying to use rsync to back up MySQL data. The tables use the MyISAM storage engine.

My expectation was that after the first rsync, subsequent rsyncs would be very fast. It turns out that if the table data has changed at all, the operation slows way down.

I did an experiment with a 989 MB MYD file containing real data:

Test 1 - recopying unmodified data

  • rsync -a orig.MYD copy.MYD
    • takes a while as expected
  • rsync -a orig.MYD copy.MYD
    • instantaneous - speedup is in the millions

Test 2 - recopying slightly modified data

  • rsync -a orig.MYD copy.MYD
    • takes a while as expected
  • UPDATE table SET counter = counter + 1 WHERE id = 12345
  • rsync -a orig.MYD copy.MYD
    • takes as long as the original copy!

What gives? Why is rsync taking forever just to copy a tiny change?

Edit: In fact, the second rsync in Test 2 takes as long as the first. rsync is apparently copying the whole file again.

Edit: Turns out when copying from local to local, --whole-file is implied. Even with --no-whole-file, the performance is still terrible.

A: 

rsync still has to calculate block hashes to determine what's changed. It may be that the no-modification case is a shortcut that only looks at the file's modification time and size.
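To illustrate the kind of shortcut meant here, below is a minimal Python sketch of a size-and-mtime comparison. It is not rsync's actual code: the function name `quick_check_same` is made up for this example, and comparing timestamps at whole-second granularity is an assumption chosen for simplicity.

```python
import os


def quick_check_same(src, dst):
    """Return True if dst looks up to date judging by size and mtime only.

    Hypothetical helper for illustration; no file contents are read.
    """
    if not os.path.exists(dst):
        return False
    s, d = os.stat(src), os.stat(dst)
    # Whole-second comparison is a simplification chosen for this sketch.
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)
```

If a check like this passes, nothing needs to be read or hashed, which would explain the instantaneous second run in Test 1. Once the UPDATE touches the file, its modification time changes, the shortcut no longer applies, and the contents have to be examined.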

Jason S
A: 

rsync uses an algorithm that first checks whether a file has changed and then works out which parts of it changed. In a large database, it is common for changes to be spread throughout a large portion of the file. This is rsync's worst-case scenario.
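A rough Python sketch of why scattered changes hurt: it hashes fixed-size blocks of two byte strings and reports which blocks differ. This is a simplification, not rsync's real algorithm (rsync uses a rolling checksum so it can match blocks at any offset); the function name `changed_blocks`, the 4096-byte block size, and the sample data are all assumptions for the example.

```python
import hashlib


def changed_blocks(old, new, block_size=4096):
    """Return the indexes of fixed-size blocks whose hashes differ."""
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        a = old[i * block_size:(i + 1) * block_size]
        b = new[i * block_size:(i + 1) * block_size]
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
            changed.append(i)
    return changed


# 64 blocks of zeros standing in for a data file.
old = bytes(64 * 4096)

# One flipped bit in one place: a single block differs.
one = bytearray(old)
one[12345] ^= 1

# One flipped bit every 8192 bytes: half of all blocks differ.
scattered = bytearray(old)
for offset in range(0, len(old), 8192):
    scattered[offset] ^= 1

print(len(changed_blocks(old, bytes(one))))        # 1
print(len(changed_blocks(old, bytes(scattered))))  # 32
```

Note that even in the single-change case, every block of both files still has to be read and hashed to find the one that differs.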

Ben Reisner
A: 

Rsync is file based. If you found a way of doing it with a block-based system, then you could back up just the blocks/bytes that had changed.

LVM snapshots might be one way of doing this.

James C
A: 

When doing local copies, rsync defaults to --whole-file for a reason: it's faster than doing the checks.

  • If you want the fastest local copy, you already got it.
  • If you want to see the rsync speedup, copy over the network. It's impressive, but won't be faster than a local full copy.

rsync for local copies is a nice replacement for cp when you have a big directory where only some files change. It'll copy those files whole, but quickly skip the ones that haven't been modified (just checking timestamps and file size). For a single big file, it's no better than cp.
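The local-copy behaviour described above can be sketched roughly in Python as follows. This is only an approximation of the idea, not what rsync actually runs: `sync_tree` is a hypothetical name, timestamps are compared at whole-second granularity as a simplification, and deletions, permissions, and symlinks are ignored.

```python
import os
import shutil


def sync_tree(src_dir, dst_dir):
    """Copy files whole when size or mtime differ; skip the rest.

    Returns the sorted relative paths of the files that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel_root = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            if os.path.exists(dst):
                s, d = os.stat(src), os.stat(dst)
                # Cheap check only: no file contents are read here.
                if (s.st_size == d.st_size
                        and int(s.st_mtime) == int(d.st_mtime)):
                    continue
            # copy2 preserves the mtime, so the next run can skip this file.
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

With many files, most of them are skipped after a couple of stat calls. With one 989 MB file that has changed, the only option left in this scheme is to copy all of it again.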

Javier