I want to copy an entire Linux server over the network before it is decommissioned, so we can be sure nothing is lost.

I ran du / and was told there are 60 GB of data under /.

Then I ran rsync -r / root@newserver:/old-server, and when I ran du in the old-server directory on the new server I got 22 GB.

So why the difference? Is there something that du can see but rsync can't copy?

+3  A: 

You probably have deleted files that can't yet be deallocated because there are open filehandles on them. (I didn't previously know that du would see the usage from those, but some testing showed that it does.) You can research this using lsof. The two main causes of this in my experience are deleting Apache logs without kicking the httpd, and deleting MySQL tables from the filesystem rather than by using DROP TABLE.
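
To check whether any deleted files are still held open, something like the sketch below may help. It is Linux-only and assumes /proc is mounted; the function name list_deleted_open_files is made up for illustration. It relies on the kernel appending " (deleted)" to the link target of an open file descriptor whose file has been unlinked. Where lsof is installed, lsof +L1 (open files with a link count below 1) should give a similar list.

```shell
# List deleted-but-still-open files by scanning /proc (Linux only).
# Each /proc/<pid>/fd/<n> is a symlink; for an unlinked file the kernel
# reports the target as "<path> (deleted)".
# Run as root to see file descriptors of other users' processes.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Processes may exit or be unreadable while we scan; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Each output line names the process (by the pid in the /proc path) that still holds the file, which is usually enough to decide what to restart.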

chaos
I think this would affect df output, but not du.
Eugene Morozov
That's what I previously thought, as I'd only dealt with it via df previously. As I indicated in my answer, testing showed that I was wrong. du turns out to analyze disk usage at a pretty low level unless you tell it not to with --apparent-size.
chaos
--apparent-size has to do with sparse files, not deleted files. du only counts files it can see, and it can't see deleted files.
Lars Wirzenius
This is really sort of fascinating, how people keep 'correcting' me based on the behavior du has *in their imagination* as opposed to, y'know, trying it. Again, I have EXPERIMENTALLY VERIFIED that du will see the disk usage from a deleted file that has not been deallocated because of an open fd.
chaos
A: 

There are some special filesystems that you should avoid copying with rsync, for example /proc, /sys, /tmp. They may account for part of the difference you see, although it seems too big to be explained by that alone.

There could also be some unreadable directories (for example, ones without r or x on them). I don't remember whether a process running with root rights can access such directories without fixing the permissions first.

Better to generate and compare lists of files and their md5 sums.
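
One way to build such a list is sketched below. It assumes GNU find, sort, xargs and md5sum are available on both machines; the function name checksum_manifest and the file names in the usage lines are made up for illustration.

```shell
# Print a sorted "checksum  path" line for every regular file under $1.
# -xdev keeps find on one filesystem, so /proc and /sys under / are skipped.
# LC_ALL=C makes the sort order identical on both machines.
checksum_manifest() {
    ( cd "$1" && find . -xdev -type f -print0 \
        | LC_ALL=C sort -z \
        | xargs -0 -r md5sum )
}

# Hypothetical usage:
#   old server:  checksum_manifest / > /tmp/old.md5
#   new server:  checksum_manifest /old-server > /tmp/new.md5
#   then copy one file across and run:  diff /tmp/old.md5 /tmp/new.md5
```

Because paths are printed relative to the starting directory, the two manifests can be compared line by line even though the trees live under different roots.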

Eugene Morozov
A: 

If you've got a bit of time on your hands, you could figure out exactly what the difference is: run cd /; find . > /tmp/old on the old server, cd /old-server; find . > /tmp/new on the new server, then vimdiff the two files to see what's changed.
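
find does not guarantee any particular output order, so the two listings may need sorting before they line up. A minimal sketch, assuming both listing files have been copied to the same machine (compare_listings is a made-up name, and file names containing newlines are not handled):

```shell
# Show entries present in only one of two `find` listings.
# Lines only in $1 are printed flush left; lines only in $2 are tab-indented.
compare_listings() {
    sorted_old=$(mktemp) || return 1
    sorted_new=$(mktemp) || { rm -f "$sorted_old"; return 1; }
    LC_ALL=C sort "$1" > "$sorted_old"
    LC_ALL=C sort "$2" > "$sorted_new"
    # -3 suppresses the column of lines common to both files.
    LC_ALL=C comm -3 "$sorted_old" "$sorted_new"
    rm -f "$sorted_old" "$sorted_new"
}

# Hypothetical usage:  compare_listings /tmp/old /tmp/new
```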

David Wolever
A: 

Maybe you need to dd your hard drives?

vitaly.v.ch
A: 

Some suggestions:

  • sparse files (use -S)
  • hard links (use -H)
  • /proc and /sys (use --exclude or, better, -x and back up each filesystem separately)

I tend to use rsync -axHSW --numeric-ids in similar circumstances.
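
As a rough sketch of the "each filesystem separately" approach, the helper below only prints one rsync command per mount point so the commands can be reviewed before anything is run. The function name print_rsync_commands is made up, the destination is taken from the question, and the mount points in the usage line are placeholders; paths containing spaces would need quoting before the printed commands are used.

```shell
# Print (not run) one rsync command per mount point.
# $1 is the destination prefix; the remaining arguments are mount points.
print_rsync_commands() {
    dest=$1
    shift
    for mp in "$@"; do
        mp=${mp%/}   # "/" becomes "", "/home/" becomes "/home"
        # The trailing slash on the source copies the directory's contents.
        printf 'rsync -axHSW --numeric-ids %s/ %s%s/\n' "$mp" "$dest" "$mp"
    done
}

# Hypothetical usage:
#   print_rsync_commands root@newserver:/old-server / /home
```

With -x, each run stays on the filesystem it starts on, so /proc and /sys are never descended into and every real filesystem gets its own explicit command.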

Lars Wirzenius