I am looking for a good way to transfer non-trivial amounts of data (between 10 MB and 10 GB) from one machine to another, potentially over multiple sessions.

I have looked briefly at

  • *ftp (sftp, tftp, ftp)
  • http
  • torrents (out because I will not have a seed network in general)
  • rsync (not sure if I can really adapt this to what I need)

Are there any other protocols out there that might fit the bill a little better? Most of the above are not very fault tolerant in and of themselves, but rather rely on client/server apps to pick up the slack. At this stage I care much more about the protocol itself, rather than a particular client/server implementation that works well.

(And yes, I know I could write my own over UDP, but I'd prefer almost anything else!)

+1  A: 

BitTorrent doesn't require a big seed network to be effective - it'll work just fine with one seeder and one peer. There's a little bit of overhead setting up a tracker etc., but once set up it'd be a nice, zippy, fault-tolerant method of transferring.

ceejayoz
Hmm, I was under the impression that 1 seed and 1 peer devolved to essentially a ton of small 'ftp style' sessions which would increase the overhead on the seed machine with little value. I guess I need to go do more research!!
JT
+3  A: 

rsync is almost always the best bet.

Since it transfers only the differences, an interrupted transfer does not have to start over: the next run sends only what is still missing at the destination, as long as the partial file was kept (for example with --partial).

Javier
+4  A: 

I use rsync (over SSH) to transfer anything that I think might take more than a minute.

It's easy to rate-limit, suspend/resume and get progress reports. You can automate it with SSH keys. It's (usually) already installed (on *nix boxes, anyway).

Depending on what you need, rsync can probably adapt. If you're distributing to a lot of users, FTP/HTTP might be better for firewall concerns; but rsync is great for one-to-one or one-to-a-few transfers.

Peter Stone
rsync sounds like my first stop on the research train. I wasn't aware that it was quite that richly featured (rate limiting, progress reports, etc.).
JT
Look at these options to get started: -avzP --stats --bwlimit
Peter Stone
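As a rough illustration of the options mentioned above, here is a minimal Python sketch that wraps rsync over SSH and simply re-runs it until it succeeds. The function names, the retry count, and the wait time are assumptions made up for this example, not part of rsync. Because -P implies --partial, each retry can pick up from the partially transferred file instead of starting over.

```python
import subprocess
import time


def build_rsync_command(source, destination, bwlimit_kbps=None):
    """Build an rsync-over-SSH command line from the options named above."""
    # -a archive, -v verbose, -z compress, -P = --partial --progress
    command = ["rsync", "-avzP", "--stats", "-e", "ssh"]
    if bwlimit_kbps is not None:
        # --bwlimit takes a rate, by default in (roughly) kilobytes per second
        command.append(f"--bwlimit={bwlimit_kbps}")
    command += [source, destination]
    return command


def transfer_with_retries(source, destination, bwlimit_kbps=None,
                          attempts=5, wait_seconds=30):
    """Re-run rsync until it exits with status 0 or the attempts run out.

    The retry policy (5 attempts, 30 s apart) is an arbitrary assumption.
    """
    command = build_rsync_command(source, destination, bwlimit_kbps)
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            time.sleep(wait_seconds)
    return False
```

A call might look like `transfer_with_retries("big.iso", "user@example.com:/data/", bwlimit_kbps=500)`, where the host and paths are placeholders.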
+1  A: 

Well, HTTP is a good option, in that it supports restarting partial transfers by using byte ranges. FTP or TFTP are good because you can get server software that's extremely simple to configure, rather than having to lock down something like an HTTP server.

Mark Bessey
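To make the byte-range idea concrete, here is a minimal Python sketch of resuming a partial HTTP download with a Range header. The helper names and the chunk size are assumptions for illustration, and it only works as a resume if the server actually honours range requests (it falls back to a full download otherwise).

```python
import os
import urllib.error
import urllib.request


def range_header(offset):
    """Return the request headers needed to resume from byte `offset`."""
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def resume_download(url, dest_path, chunk_size=1 << 20):
    """Download `url` to `dest_path`, continuing from any partial file."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    request = urllib.request.Request(url, headers=range_header(offset))
    try:
        response = urllib.request.urlopen(request)
    except urllib.error.HTTPError as err:
        # 416 (range not satisfiable) is assumed here to mean the local
        # file is already complete; anything else is a real error.
        if err.code == 416:
            return
        raise
    with response:
        # 206: server sent only the requested range, so append to the file.
        # 200: server ignored the range and sent everything, so overwrite.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling `resume_download(url, "big.iso")` again after a dropped connection continues from the bytes already on disk. It does not verify that the remote file is unchanged between sessions, so a checksum comparison afterwards would be a sensible addition.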
+1  A: 

GridFTP is what Argonne is using to transport huge amounts of data reliably.

plinth