I have a source directory eg /my/source/directory/ and a destination directory eg /my/dest/directory/, which I want to mirror with some constraints.

  • I want to copy files which meet certain criteria of the find command, eg -ctime -2 (less than 2 days old) to the dest directory to mirror it
  • I want to include some of the prefix so I know where it came from, eg /source/directory
  • I'd like to do all this with absolute paths so it doesn't depend on which directory I run from
  • I'd guess not having cd commands is good practice too.
  • I want the subdirectories created if they don't exist

So

/my/source/directory/1/foo.txt -> /my/dest/directory/source/directory/1/foo.txt
/my/source/directory/2/3/bar.txt -> /my/dest/directory/source/directory/2/3/bar.txt

I've hacked together the following command line but it seems a bit ugly, can anyone do better?

find /my/source/directory -ctime -2 -type f -printf "%P\n" | xargs -IFILE rsync -avR /my/./source/directory/FILE /my/dest/directory/

Please comment if you think I should add this command line as an answer myself, I didn't want to be greedy for reputation.

+3  A: 

You could try cpio using the copy-pass mode, -p. I usually use it with overwrite all (-u), create directories (-d), and maintain modification time (-m).

find myfiles | cpio -pmud target-dir

Keep in mind that find should produce relative path names, which doesn't fit your absolute path criteria. This could of course be 'solved' using cd, which you also don't like (why not?)

(cd mypath; find myfiles | cpio ... )

The brackets will spawn a subshell, and will keep the state change (i.e. the directory switch) local. You could also define a small procedure to abstract away the 'ugliness'.
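As a minimal sketch of that idea (the helper name in_dir is hypothetical, not something from this answer), a small function can wrap the subshell so that the caller's working directory is never touched:

```shell
#!/bin/sh
# in_dir DIR COMMAND...: run COMMAND with DIR as its working directory.
# The ( ... ) subshell keeps the cd local, so the caller stays where it was.
# (in_dir is a hypothetical helper name used only for this illustration.)
in_dir() {
    dir=$1
    shift
    ( cd "$dir" && "$@" )
}

before=$(pwd)
inside=$(in_dir / pwd)   # pwd runs with / as its working directory
after=$(pwd)

echo "before=$before inside=$inside after=$after"
```

The same wrapper could be used as in_dir mypath sh -c 'find myfiles | cpio -pmud target-dir', assuming cpio is installed.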

roe
I've always felt a bit uncomfortable with cds in scripts. Maybe it's just being overnervous, but I always like commands which work from whatever context and have no effect on state. I'm not a guru shell scripter though, and this is probably unreasonable.
Nick Fortescue
that's why I put it in brackets, that'll spawn a subshell and so won't have side effects outside the brackets. I'll add that...
roe
A: 

An alternative is to use tar: (cd "$SOURCE" && tar cf - .) | (cd "$DESTINATION" && tar xf -)

EDIT: Ah, I missed the bit about preserving CTIME. I believe most implementations of tar will preserve mtime, but if preserving ctime is critical, then cpio is indeed the only way.

Also, some tar implementations (GNU tar being one) can select the files to include based on atime and mtime, though seemingly not ctime.

Vatine
Nice answer, I hadn't thought of tar.
Nick Fortescue
Actually, this doesn't handle the selection conditions based on file time.
Nick Fortescue
@vatine: it was not, fortunately, preserving ctime but simply selecting based on ctime.
Jonathan Leffler
+1  A: 

If you're using find, always use -print0 and pipe the output through xargs -0; well, almost always. The first file with a space in its name will bork the script if you use the default newline-terminated output of find.
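A rough demonstration of the difference, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The file names and the temporary directory are made up for the example:

```shell
#!/bin/sh
# Build a throwaway tree containing one file name with a space in it.
tmp=$(mktemp -d)
mkdir -p "$tmp/src"
: > "$tmp/src/plain.txt"
: > "$tmp/src/with space.txt"

# Default output: xargs splits on blanks as well as newlines, so the
# name "with space.txt" arrives as two separate arguments.
naive_count=$(find "$tmp/src" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-terminated output: each file name arrives as exactly one argument.
safe_count=$(find "$tmp/src" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "arguments without -print0: $naive_count, with -print0: $safe_count"
rm -rf "$tmp"
```

With two files in the tree, the NUL-terminated pipeline should report two arguments, while the default pipeline reports more than two.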

I agree with all the other posters - use cpio or tar if you can. It'll do what you want and save the hassle.

Adam Hawes
A: 
#!/bin/sh

SRC=/my/source/directory
DST=/my/dest/directory

for i in $(find $SRC -ctime -2 -type f) ; do
    SUBDST=$DST$(dirname $i)
    mkdir -p $SUBDST
    cp -p $i $SUBDST
done

And I suppose, since you want to include "where it came from", that you are going to use different source directories. This script can be modified to take source dir as an argument simply by replacing SRC=/my/source/directory, with SRC=$1

EDIT: Removed redundant if statement.

Does not work when filenames contain whitespace.

jandersson
I think mkdir -p will succeed if the directory already exists, so the if is a bit redundant. You could use it in conjunction with an existence test to make sure it's not a file though.
roe
Any white space or special characters in the file or directory names will bork this up. In scripts always quote such variables; e.g. mkdir -p "$SUBDST". Better to just use find and cpio as others have suggested instead of recreating a non-round wheel.
Dave C
quoting will not help at all, since it's all about the delimiter of the for loop (IFS). I agree in this case it's better to use cpio, but what if you want to implement custom logic based on a file name, and not just copy files as cpio does?
jandersson
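One possible way to keep custom per-file logic while avoiding the IFS splitting discussed in these comments is to let find pass the names to an inline script with -exec ... {} +. This is only a sketch: the paths live under a temporary directory invented for the demo, and the loop body simply repeats the mkdir/cp steps from the script above.

```shell
#!/bin/sh
# Demo tree under a temporary directory (all paths here are example values).
tmp=$(mktemp -d)
SRC=$tmp/my/source/directory
DST=$tmp/my/dest/directory
mkdir -p "$SRC/1" "$SRC/2/3 4" "$DST"
echo foo > "$SRC/1/foo.txt"
echo bar > "$SRC/2/3 4/bar baz.txt"

# find hands the file names to the inline script as "$@", one argument per
# file, so names are never re-split on whitespace by the shell.
find "$SRC" -type f -ctime -2 -exec sh -c '
    dst=$1
    shift
    for f in "$@"; do
        # Custom logic based on "$f" could go here.
        subdst=$dst$(dirname "$f")
        mkdir -p "$subdst" && cp -p "$f" "$subdst/"
    done
' sh "$DST" {} +

# The demo tree under $tmp is left in place for inspection.
```

As in the original script, the full source path is appended to the destination, so the copies land under "$DST$SRC".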
+5  A: 

This is remarkably similar to a (closed) question: Bash scripting copying files without overwriting. The answer I gave cites the 'find | cpio' solution mentioned in other answers (minus the time criteria, but that's the difference between 'similar' and 'same'), and also outlines a solution using GNU 'tar'.

ctime

When I tested on Solaris, neither GNU tar nor (Solaris) cpio was able to preserve the ctime setting; indeed, I'm not sure that there is any way to do that. For example, the touch command can set the atime or the mtime or both - but not the ctime. The utime() system call also only takes the mtime or atime values; it does not handle ctime. So, I believe that if you find a solution that preserves ctime, that solution is likely to be platform-specific. (Weird example: hack the disk device and edit the data in the inode - not portable, requires elevated privileges.) Rereading the question, though, I see that 'preserving ctime' is not part of the requirements (phew); it is simply the criterion for whether the file is copied or not.
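A small experiment along these lines, using only touch and find (the file is a throwaway created for the demo): back-dating a file with touch moves its mtime and atime, but the ctime is reset to the moment of the change, so selection by -ctime still sees the file as recent.

```shell
#!/bin/sh
tmp=$(mktemp -d)
: > "$tmp/file"

# Back-date mtime and atime to 2000-01-01 00:00; touch cannot set ctime,
# and the metadata change itself stamps ctime with the current time.
touch -t 200001010000 "$tmp/file"

by_mtime=$(find "$tmp/file" -mtime -2)   # selection by modification time
by_ctime=$(find "$tmp/file" -ctime -2)   # selection by inode change time

echo "matched by -mtime -2: '${by_mtime}'"
echo "matched by -ctime -2: '${by_ctime}'"
rm -rf "$tmp"
```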

chdir

I think that the 'cd' operations are necessary, but they can be wholly localized to the script or command line, as illustrated in the question cited and in the command lines below, the second of which assumes GNU tar.

(cd /my; find source/directory -ctime -2 | cpio -pvdm /my/dest/directory)

(cd /my; find source/directory -type f -ctime -2 | tar -cf - -T - ) |
    (cd /my/dest/directory; tar -xf -)

Without using chdir() (aka cd), you need specialized tools or options to handle the manipulation of the pathnames on the fly.
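As one illustration of such options, and assuming GNU find (for -printf) and GNU tar (for -C, --null and -T), the path manipulation can be pushed into the tools themselves. The directory tree below is created under a temporary directory purely for the demo:

```shell
#!/bin/sh
# Assumes GNU find and GNU tar; the paths are example values for the demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/my/source/directory/2/3" "$tmp/my/dest/directory"
echo bar > "$tmp/my/source/directory/2/3/bar.txt"

# find prints each name relative to the starting point (%P), prefixed with
# the part of the path to keep; tar -C changes directory internally, so the
# calling shell never runs cd.
find "$tmp/my/source/directory" -type f -ctime -2 \
        -printf 'source/directory/%P\0' |
    tar -C "$tmp/my" --null -T - -cf - |
    tar -C "$tmp/my/dest/directory" -xf -

# The demo tree under $tmp is left in place for inspection.
```

Strictly speaking tar still calls chdir() internally; the point is only that no cd appears in the script.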

Names with blanks, newlines, etc

The GNU-specific 'find -print0' and 'xargs -0' are very powerful and effective, as noted by Adam Hawes. Funnily enough, GNU cpio has an option to handle the output from 'find -print0', and that is '--null' or its short form '-0'. So, using GNU find and GNU cpio, the safe command is:

(cd /my; find source/directory -ctime -2 -print0 |
    cpio -pvdm0 /my/dest/directory)

Note: This does not overwrite pre-existing files under the backup directory. Add -u to the cpio command for that.

Similarly, GNU tar supports --null (apparently with no -0 short-form), and could also be used:

(cd /my; find source/directory -type f -ctime -2 -print0 | tar -cf - --null -T - ) |
    (cd /my/dest/directory; tar -xf -)

The GNU handling of file names with the null terminator is extremely clever and a valuable innovation (though I only became aware of it fairly recently, courtesy of SO; it has been in GNU tar for at least a decade).

Jonathan Leffler
The OP said "mirror it", so `cpio` requires `-u`. I don't remember a time when the GNU utils didn't support the `-0` option. Such command-line tools are like an unsafe razor. It is easy to shoot yourself in the foot. Nice answer!
J.F. Sebastian
Hmmm...I went and looked at GNU tar 1.13 source, dated 1999, and it is in there. Maybe 'fairly recent' means 'of which I became aware only fairly recently - thank you, SO'.
Jonathan Leffler
A: 

#!/usr/bin/sh

# Script to copy files with the same directory structure

echo "Please enter Full Path of Source DIR (Starting with / and ending with /):"
read spath

echo "Please enter Full Path of Destination location (Starting with / and ending with /):"
read dpath

# Field number of the last directory component of the source path
si=$(echo "$spath" | awk -F/ '{print NF-1}')

# File names containing whitespace are still split by this for loop
for fname in $(find "$spath" -type f -print)
do
    # Directory of the file, starting from the last component of the source path
    cdir=$(echo "$fname" | awk -F/ -v si="$si" '{ for (i=si; i<NF; i++) printf "%s/", $i; printf "\n"; }')
    if [ -n "$cdir" ]; then
        if [ ! -d "$dpath$cdir" ]; then
            mkdir -p "$dpath$cdir"
        fi
    fi

    cp "$fname" "$dpath$cdir"
done