Is it possible to specify a time range so that rsync only operates on recently changed files?

I'm writing a script to back up recently added files over SSH, and rsync seems like an efficient solution. My problem is that my source directories contain a huge backlog of older files which I have no interest in backing up.

The only solution I've come across so far is doing a find with ctime to generate a --files-from file. This works, but I have to deal with some old installations with versions of rsync that don't support --files-from. I'm considering generating --include-from patterns in the same way but would love to find something more elegant.
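For reference, the find-plus-files-from approach described above could be sketched roughly as follows. The helper names, the argument order, and the example paths in the usage note are hypothetical, and the sketch assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under directory $1 whose status
# changed within the last $2 days, as paths relative to $1.
list_recent_files() {
    ( cd "$1" && find . -type f -ctime "-$2" -print )
}

# Hypothetical helper: copy only those files from $1 to the rsync
# destination $3 (for example a host:path pair).
sync_recent_files() {
    list_file=$(mktemp) || return 1
    list_recent_files "$1" "$2" > "$list_file"
    # Paths in the list are interpreted relative to the source directory $1.
    rsync -av --files-from="$list_file" "$1" "$3"
    status=$?
    rm -f "$list_file"
    return "$status"
}
```

A call might look like: sync_recent_files /var/log/app 1 backup_host:/backups/app/ (both the path and the host are placeholders).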

A: 

How about creating a temporary directory, symlinking or hardlinking the files in, then rsyncing that?
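A minimal sketch of that idea, assuming symlinks rather than hardlinks: creating a hardlink updates the ctime of the original file, which would interfere with any later ctime-based selection. The helper name and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: symlink every regular file under $1 whose status
# changed within the last $2 days into staging directory $3, keeping the
# relative directory layout. Assumes file names contain no newlines.
stage_recent_files() {
    stage_src=$(cd "$1" && pwd) || return 1   # absolute path for link targets
    stage_days=$2
    stage_dir=$3
    mkdir -p "$stage_dir" || return 1
    ( cd "$stage_src" && find . -type f -ctime "-$stage_days" -print ) |
    while IFS= read -r rel; do
        mkdir -p "$stage_dir/$(dirname "$rel")" &&
        ln -sf "$stage_src/$rel" "$stage_dir/$rel"
    done
}
```

The staging directory could then be transferred with rsync -avL, where -L (--copy-links) makes rsync copy the files the symlinks point to rather than the links themselves.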

Hasturkun
+2  A: 

Why not just take the heat on backing up the whole directory once and take advantage of the incremental backups provided by rsync, rdiff and their cousins? You won't waste disk space at the backup destination, because the old files will be perpetually unchanged.

Backing up the whole thing is simpler and carries substantially less risk of errors. Trying to selectively back up some files and not others is a recipe for not backing up what you need without realizing it, then getting burned when you can't restore a critical file.

Otherwise you should reorganize your source directory so there is less 'decision making' in your backup script.

whatsisname
I would normally agree about the risk of errors, but I'll never have any use for the older files (logs and other records which will never change). I would just take the heat, but the thought of having to download and regularly reprocess several gigabytes of unwanted bloat is what prompted this question in the first place. Reorganisation is probably the solution: I can't change the existing structure, but I can set up a temporary directory as Hasturkun suggested.
Ken
+3  A: 

It looks like you can specify shell commands in the arguments to rsync (see Remote rsync executes arbitrary shell commands).

I have been able to limit the files that rsync looks at by using:

rsync -av remote_host:'$(find logs -type f -ctime -1)' local_dir

This looks for any files changed in the last day (-ctime -1) and then rsyncs those into local_dir.

I'm not sure if this feature is by design but I'm still digging into the documentation.
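The command above relies on the remote shell expanding the argument, and on no file name containing spaces. As far as I know, newer rsync releases (3.2.4 and later) escape such arguments by default unless --old-args is given, so the trick may not work everywhere. A sketch of an alternative that runs find over ssh explicitly and feeds the result to --files-from (so it needs an rsync that supports that option) is shown here; the helper name and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pull files changed in the last $3 days from the
# directory $2 on host $1 into the local directory $4.
# Assumes $2 contains no single quotes and file names contain no newlines.
pull_recent_files() {
    pull_host=$1
    pull_dir=$2
    pull_days=$3
    pull_dest=$4
    # find prints paths relative to the remote directory; rsync reads that
    # list from standard input and resolves it against the same directory.
    ssh "$pull_host" "cd '$pull_dir' && find . -type f -ctime -$pull_days -print" |
        rsync -av --files-from=- "$pull_host:$pull_dir/" "$pull_dest"
}
```

For the example in this answer, the call would be: pull_recent_files remote_host logs 1 local_dir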

Ken
A: 

May I suggest you drop rsync and look at rdiff-backup?

jlouis
Thanks, I'll take a look. I looked at it previously, but the CIFS compatibility issue put me off. ( http://rdiff-backup.nongnu.org/FAQ.html#cifs )
Ken