Efficient copy of entire directory

views:

116

+2 Q:

I want to copy one directory and the two files under it to another location on shared storage. Is it possible to combine the three (one directory and two files) into one continuous file write and split it apart on the other side to save cost? I am limited to C and Unix/Linux. I am considering creating a structure with the inode info and reading the data back at the receiver.

Thanks!

+5 A:

rsync is what you're looking for. Or tar if you feel like working with the shell on the other side.

Ignacio Vazquez-Abrams 2010-04-05 03:22:09

The best optimization you can do is to use large buffers for the copy. If that is not enough then restructure your data to be a single file instead of two files in a directory. Next step is to get faster hardware.
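As a rough illustration of the large-buffer advice, here is a minimal sketch in C using POSIX `open`/`read`/`write`. The function name `copy_file` and the 1 MiB `COPY_BUF_SIZE` are assumptions made for this example, not something from the question. The right buffer size depends on the storage and should be measured.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffer size (1 MiB); tune it for the storage in question. */
#define COPY_BUF_SIZE (1 << 20)

/* Copy src to dst through one large heap buffer.
 * Returns 0 on success, -1 on any error. */
int copy_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    char *buf = malloc(COPY_BUF_SIZE);
    int rc = buf ? 0 : -1;
    ssize_t n = 0;

    while (rc == 0 && (n = read(in, buf, COPY_BUF_SIZE)) > 0) {
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w <= 0) {
                rc = -1;
                break;
            }
            off += w;
        }
    }
    if (n < 0)
        rc = -1;

    free(buf);
    close(in);
    if (close(out) < 0)
        rc = -1;
    return rc;
}
```

A call such as `copy_file("a", "/shared/a")` (paths are placeholders) copies one file. The sketch does not retry on `EINTR` and does not preserve ownership or timestamps.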

There are many file systems in common use on Unix/Linux, and you would need to write a custom copy algorithm for each. There is rarely a guarantee of contiguous blocks for even a single file, let alone two. Odds are also good that your block copy routine would bypass existing file system optimizations and end up less efficient than they are.

Reading an entire file into memory before writing it out will give more benefit in terms of minimizing seek times than opening fewer files would, at least for files over a certain size. And not all hardware suffers from seek times.

drawnonward 2010-04-05 03:58:58

Thank you! This is exactly what I want to do. Do you have any hints on how to do this? We can ignore the directory and only consider the case of two files, a and b. I will first mmap files a and b into one contiguous region of memory, then copy the entire buffer to the destination. But in this case, I will only get one big file at the destination. I do not know how to split the one big file back into two files at low cost. Do you have any hints? I guess we could reset the beginning offset of the file, but I am not sure how to do that. Thanks,

Hui Jin 2010-04-05 18:21:24

If you do not need to preserve metadata about the files, like names and dates, just write the length of each file before the contents. If you do need to preserve metadata, use a standard archive format like tar or zip with no compression. With exactly two files, you could just write out the offset of where the second file starts.
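The length-prefix idea could look roughly like the following sketch in C. The names `pack_two` and `unpack_two`, the 64 KiB buffer, and the choice of a native-endian `uint64_t` length field are all assumptions for illustration. The last one only works if writer and reader share the same byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Append one file to out as: uint64_t length (native byte order,
 * an assumption of this sketch), followed by the raw contents. */
static int pack_one(FILE *out, const char *src)
{
    struct stat st;
    if (stat(src, &st) != 0)
        return -1;
    FILE *in = fopen(src, "rb");
    if (!in)
        return -1;

    uint64_t len = (uint64_t)st.st_size, copied = 0;
    int rc = fwrite(&len, sizeof len, 1, out) == 1 ? 0 : -1;
    char buf[65536];
    size_t n;
    while (rc == 0 && (n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            rc = -1;
        copied += n;
    }
    /* Fail if the file changed size after stat() or a read error occurred. */
    if (ferror(in) || copied != len)
        rc = -1;
    fclose(in);
    return rc;
}

/* Read one length-prefixed record from in and write its contents to dst. */
static int unpack_one(FILE *in, const char *dst)
{
    uint64_t len;
    if (fread(&len, sizeof len, 1, in) != 1)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (!out)
        return -1;

    char buf[65536];
    int rc = 0;
    while (rc == 0 && len > 0) {
        size_t want = len < sizeof buf ? (size_t)len : sizeof buf;
        size_t n = fread(buf, 1, want, in);
        if (n == 0 || fwrite(buf, 1, n, out) != n)
            rc = -1;
        len -= n;
    }
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/* Combine files a and b into the single file packed. Returns 0 or -1. */
int pack_two(const char *a, const char *b, const char *packed)
{
    FILE *out = fopen(packed, "wb");
    if (!out)
        return -1;
    int rc = (pack_one(out, a) == 0 && pack_one(out, b) == 0) ? 0 : -1;
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/* Split packed back into a_out and b_out. Returns 0 or -1. */
int unpack_two(const char *packed, const char *a_out, const char *b_out)
{
    FILE *in = fopen(packed, "rb");
    if (!in)
        return -1;
    int rc = (unpack_one(in, a_out) == 0 && unpack_one(in, b_out) == 0) ? 0 : -1;
    fclose(in);
    return rc;
}
```

The sender would call `pack_two("a", "b", "/shared/ab.bin")` and the receiver `unpack_two("/shared/ab.bin", "a", "b")`, with placeholder paths. File names and dates are not carried across, which is where tar or cpio becomes the simpler choice.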

drawnonward 2010-04-06 11:28:16

For some reason, cpio is often preferred over tar for this.

You can, for example, pipe cpio into an ssh session that runs cpio on the remote side.

Ben Voigt 2010-04-05 04:42:23

Do you mean s/svn/ssh/? Anyhow, tar can do this too.

ephemient 2010-04-05 14:24:08

Yes, of course: ssh.

Ben Voigt 2010-04-05 17:50:08
