I need to programmatically download a large file before processing it. What's the best way to do that? As the file is large, I want to specify a maximum time to wait, after which I can forcefully exit.

I know of WebClient.DownloadFile(), but there does not seem to be a way to specify an amount of time to wait before forcefully exiting.

try
{
    WebClient client = new WebClient();
    Uri uri = new Uri(inputFileUrl);
    client.DownloadFile(uri, outputFile);
}
catch (Exception ex)
{
    throw;
}

Another way is to use a command-line utility (I found one) to download the file, launch it using ProcessStartInfo, and use Process.WaitForExit(int milliseconds) to forcefully exit if it takes too long.

ProcessStartInfo startInfo = new ProcessStartInfo();
//set startInfo object

try
{
    using (Process exeProcess = Process.Start(startInfo))
    {
        // Wait up to the specified time (1000 * 60 * 60 ms = 1 hour)
        exeProcess.WaitForExit(1000 * 60 * 60);

        // Check whether the process has exited
        if (!exeProcess.HasExited)
        {
            // Kill the process and report the timeout
            exeProcess.Kill();
            throw new ApplicationException("Downloading timed out");
        }
    }
}
catch (Exception)
{
    throw;
}

Is there a better way? Please help. Thanks.

A: 

What if the user is on a slow connection?

Waiting a static amount of time and then stopping is not the way to go about this.

As long as the file is still making progress, you should keep downloading, and it's up to the user if they want to cancel it.
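One way to approximate "keep going while there is progress" is to time out on a stalled read rather than on the whole download. The following is only a minimal sketch: the class name `StallAwareDownloader`, its `Download` method and the `stallTimeoutMs` parameter are made up for illustration, and it assumes the classic `HttpWebRequest` API, whose `ReadWriteTimeout` property applies to each individual read from the response stream.

```csharp
using System;
using System.IO;
using System.Net;

static class StallAwareDownloader
{
    // Hypothetical helper: copies the resource at url to path.
    // It does not limit the total download time; it only gives up
    // when a single read makes no progress for stallTimeoutMs.
    public static void Download(string url, string path, int stallTimeoutMs)
    {
        WebRequest request = WebRequest.Create(url);

        // ReadWriteTimeout exists only on HttpWebRequest, so other
        // request types (for example file://) are left unchanged.
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.ReadWriteTimeout = stallTimeoutMs;
        }

        using (WebResponse response = request.GetResponse())
        using (Stream responseStream = response.GetResponseStream())
        using (Stream fileStream = File.Create(path))
        {
            var buffer = new byte[4096];
            int bytesRead;
            // A read that stalls longer than the timeout is expected to throw,
            // which ends the loop and disposes both streams.
            while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                fileStream.Write(buffer, 0, bytesRead);
            }
        }
    }
}
```

With this approach a slow but steady connection is allowed to finish, while a connection that goes silent is abandoned. Offering the user a cancel button would still need separate handling.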

Anon.
+4  A: 

Use a WebRequest and get the response stream. Then read blocks of bytes from the response stream and write each block to the destination file. This way you can control when to stop if the download takes too long: you get control between chunks and can decide, based on a clock, whether the download has timed out:

        DateTime startTime = DateTime.UtcNow;
        WebRequest request = WebRequest.Create("http://www.example.com/largefile");
        // Dispose the response as well as the streams
        using (WebResponse response = request.GetResponse())
        using (Stream responseStream = response.GetResponseStream())
        // Verbatim string: "\t" in a regular literal would be read as a tab
        using (Stream fileStream = File.OpenWrite(@"c:\temp\largefile")) {
            byte[] buffer = new byte[4096];
            int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
            while (bytesRead > 0) {
                fileStream.Write(buffer, 0, bytesRead);
                // Check the elapsed time between chunks
                DateTime nowTime = DateTime.UtcNow;
                if ((nowTime - startTime).TotalMinutes > 5) {
                    throw new ApplicationException(
                        "Download timed out");
                }
                bytesRead = responseStream.Read(buffer, 0, buffer.Length);
            }
        }
Remus Rusanu
pretty complicated if all he wanted was a timeout
orip
@orip, how is it complicated?
Juan Manuel
@Juan, For one it is synchronous. The asynchronous version of this example would look very different. But also it throws out the very user-friendly WebClient facade that hides the stream management stuff that is largely irrelevant 90% of the time.
Josh Einstein
orip, your code is much simpler. One advantage of using Remus' code is that I can see how much of the file has been downloaded.
hIpPy
@hlpPy: if you prefer WebClient.DownloadFileAsync/CancelAsync, you could use the WebClient.DownloadProgressChanged event to track the progress.
Remus Rusanu
+1  A: 

How about using DownloadFileAsync in the WebClient class? The cool thing about going this route is that you can cancel the operation by calling CancelAsync if it takes too long. Basically, call this method, and if a specified amount of time elapses, call CancelAsync.
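A rough sketch of that idea follows. The `TimedDownloader` class and its `Download` method are hypothetical names, not part of WebClient. The sketch assumes it is called from a console application or background thread, because blocking a UI thread this way could prevent the completion event from being delivered.

```csharp
using System;
using System.ComponentModel;
using System.Net;
using System.Threading;

static class TimedDownloader
{
    // Hypothetical helper: returns on success, throws TimeoutException if the
    // download does not finish within timeout, or WebException if it fails.
    public static void Download(Uri uri, string outputFile, TimeSpan timeout)
    {
        // Deliberately not disposed, so a late completion callback
        // can still signal it safely.
        var finished = new ManualResetEvent(false);
        AsyncCompletedEventArgs result = null;

        using (var client = new WebClient())
        {
            client.DownloadFileCompleted += (sender, e) =>
            {
                result = e;
                finished.Set();
            };

            client.DownloadFileAsync(uri, outputFile);

            if (!finished.WaitOne(timeout))
            {
                // Time is up: request cancellation, then give the client
                // a short grace period to release the output file.
                client.CancelAsync();
                finished.WaitOne(TimeSpan.FromSeconds(5));
                throw new TimeoutException("Download timed out");
            }

            if (result.Error != null)
            {
                throw new WebException("Download failed", result.Error);
            }
        }
    }
}
```

A call might look like `TimedDownloader.Download(new Uri(inputFileUrl), outputFile, TimeSpan.FromMinutes(5));`. A partially written output file may remain after a timeout, so the caller may want to delete it.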

BFree
+3  A: 

Asked here: http://stackoverflow.com/questions/295557/c-downloading-a-url-with-timeout

Simplest solution:

public string GetRequest(Uri uri, int timeoutMilliseconds)
{
    var request = System.Net.WebRequest.Create(uri);
    request.Timeout = timeoutMilliseconds;
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new System.IO.StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}

A better (more flexible) solution is this answer to the same question, in the form of a WebClientWithTimeout helper class.
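The linked answer's code is not reproduced here, so the following is only a guess at what such a helper could look like. It overrides WebClient's protected `GetWebRequest` method and sets `Timeout` on each request. The `TimeoutMilliseconds` property and the constructor are assumptions made for illustration.

```csharp
using System;
using System.Net;

// Hypothetical reconstruction of a WebClient subclass with a timeout;
// not the actual code from the linked answer.
public class WebClientWithTimeout : WebClient
{
    // Timeout applied to every request this client creates.
    public int TimeoutMilliseconds { get; set; }

    public WebClientWithTimeout(int timeoutMilliseconds)
    {
        TimeoutMilliseconds = timeoutMilliseconds;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null)
        {
            // Applies to synchronous calls such as GetResponse.
            request.Timeout = TimeoutMilliseconds;
        }
        return request;
    }
}
```

It could then be used like a regular WebClient, for example `using (var client = new WebClientWithTimeout(30000)) { client.DownloadFile(uri, outputFile); }`.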

orip
WebRequest.Timeout only measures the time until the HTTP response headers are received, not the total time until the response body is downloaded. I.e., it affects the time until GetResponse returns.
Remus Rusanu
Good point, I wasn't aware of that
orip