views:

323

answers:

2

I need to download some huge files (several gigs) from Java via FTP/HTTP. Is there a ready-made library (Java / command-line tool) to facilitate the download? Some obvious requirements are:

  1. Multi-connection download - should be able to open several connections to the server to accelerate the download (like FlashGet/GetRight/...)
  2. Resume a download

Edit - I'd really prefer not to write such a library but steal (or pay for) an existing tested, production-grade one. rsync is not relevant since I need to download files from HTTP and FTP sites; it's not for internal file transfer.

+1  A: 

The HTTP protocol does support starting a partial download at an offset, but it has limited support for validating the local partial copy of the file to make sure it doesn't have junk attached to the end (or something similar). If your environment allows it, I recommend rsync with the --partial option. It's designed to support exactly this kind of functionality from the command line.

If you can't use rsync, you may want to try working with Commons-HTTPClient and using the Range HTTP header to download manageably sized chunks.
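To illustrate the Range-header idea, here is a minimal sketch using the JDK's built-in HttpURLConnection rather than Commons-HTTPClient (the class and method names below are my own, not from any library; error handling is kept to a minimum):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

public class RangeDownload {

    // Builds the Range header value for the inclusive byte range [start, end].
    static String rangeHeader(long start, long end) {
        return "bytes=" + start + "-" + end;
    }

    // Downloads bytes [start, end] of the resource into the file at the same offset.
    // A 206 (Partial Content) status means the server honored the Range header;
    // a 200 would mean it ignored it and is sending the whole file.
    static void downloadChunk(URL url, RandomAccessFile out, long start, long end)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", rangeHeader(start, end));
        if (conn.getResponseCode() != HttpURLConnection.HTTP_PARTIAL) {
            throw new IOException("Server ignored the Range header");
        }
        try (InputStream in = conn.getInputStream()) {
            out.seek(start);
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Note that not every server supports Range requests, so checking for the 206 status before writing is important.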

Jherico
+1  A: 

If you know how to create sockets and threads in Java, it's not that difficult.

First create a request and read the headers to get the Content-Length. Then devise a strategy to split the download into chunks of, for example, 500 KB per request. Then start, say, 10 requests, using a thread for each. In each request you have to set the Range header.
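The splitting and threading described above could be sketched like this (a hand-rolled outline, not a tested library; the per-range download itself is left as a placeholder callback):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class ChunkPlanner {

    // Splits [0, contentLength) into inclusive [start, end] byte ranges
    // of at most chunkSize bytes each, suitable for Range headers.
    static List<long[]> splitRanges(long contentLength, long chunkSize) {
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < contentLength; start += chunkSize) {
            long end = Math.min(start + chunkSize, contentLength) - 1;
            ranges.add(new long[] { start, end });
        }
        return ranges;
    }

    // Runs one download task per range on a fixed pool of 10 threads.
    // downloadChunk stands in for the per-range request described above.
    static void downloadAll(List<long[]> ranges, Consumer<long[]> downloadChunk)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (long[] range : ranges) {
            pool.submit(() -> downloadChunk.accept(range));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

Because each chunk is written at a known offset, the threads can write into the same pre-sized file independently (for instance via one RandomAccessFile per thread).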

Resuming your download is a matter of storing the ranges you haven't downloaded yet. I suggest you read the HTTP/1.1 Header Fields section of the RFC if you really want to get a good grasp of the protocol.
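One simple way to store that state, purely as an illustration (the progress-file format here is invented for this sketch), is to append the index of each chunk to a side file as it completes, and skip those chunks on restart:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class ResumeState {

    // Reads the set of completed chunk indices from a progress file
    // (one index per line). Missing file means nothing is done yet.
    static Set<Integer> loadCompleted(Path progressFile) throws IOException {
        if (!Files.exists(progressFile)) {
            return new HashSet<>();
        }
        return Files.readAllLines(progressFile).stream()
                .map(Integer::parseInt)
                .collect(Collectors.toSet());
    }

    // Appends a chunk index once that chunk is fully written to disk.
    static void markCompleted(Path progressFile, int chunkIndex) throws IOException {
        Files.writeString(progressFile, chunkIndex + System.lineSeparator(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

On resume, you reload the set, schedule only the chunk indices not in it, and delete the progress file once the whole download is complete.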

However, if you're looking for an easy way out, rsync or scp should suffice.

reubensammut