I would like to use WebClient (or is there a better option?), but there is a problem. I understand that opening the stream takes some time and that this cannot be avoided. However, reading the stream in chunks takes strangely much longer than reading it all at once.

Is there a best way to do this? I mean two variants: to a string and to a file. Progress is my own delegate, and it works fine.
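For context, the two one-shot variants being compared against look roughly like this (a sketch; the URL and file name are placeholders):

using System;
using System.Net;

class OneShotDownload
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            // Download straight to a string, or straight to a file.
            string page = client.DownloadString("http://example.com/");
            client.DownloadFile("http://example.com/", "page.html");
            Console.WriteLine(page.Length);
        }
    }
}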


FIFTH UPDATE:

Finally, I managed to do it. In the meantime I checked out some other solutions, which made me realize that the problem lay elsewhere.

I've tested custom WebResponse and WebRequest objects, the libCURL.NET library, and even raw Sockets.

The difference in time was gzip compression. The compressed stream was simply half the length of the uncompressed one, which is why the download took less than 3 seconds in the browser.

Here is the code, in case someone wants to know how I solved this (some of the headers are not strictly needed):

using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Text;

public static string DownloadString(string URL)
{
    WebClient client = new WebClient();
    client.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1045 Safari/532.5";
    client.Headers["Accept"] = "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    client.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
    client.Headers["Accept-Charset"] = "ISO-8859-2,utf-8;q=0.7,*;q=0.3";

    Stream inputStream = client.OpenRead(new Uri(URL));
    MemoryStream memoryStream = new MemoryStream();
    const int size = 32 * 4096;
    byte[] buffer = new byte[size];

    // The Accept-Encoding header above asks the server for a compressed
    // response; if it obliged, decompress on the fly.
    // (Deflate/sdch responses are not handled here.)
    if (client.ResponseHeaders["Content-Encoding"] == "gzip")
    {
        inputStream = new GZipStream(inputStream, CompressionMode.Decompress);
    }

    // Read the response in chunks until the stream is exhausted.
    int count;
    do
    {
        count = inputStream.Read(buffer, 0, size);
        if (count > 0)
        {
            memoryStream.Write(buffer, 0, count);
        }
    }
    while (count > 0);

    string result = Encoding.Default.GetString(memoryStream.ToArray());
    memoryStream.Close();
    inputStream.Close();
    client.Dispose();
    return result;
}

I think the async functions would behave almost the same, so I will simply fire this function from another thread. I don't need precise progress indication.
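Firing it from another thread could look roughly like this (a minimal sketch, assuming the DownloadString method above is in scope; the URL and console harness are placeholders):

using System;
using System.Threading;

class BackgroundDownload
{
    static void Main()
    {
        // Queue DownloadString on a thread-pool thread so the
        // calling thread is not blocked while it runs.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            string html = DownloadString("http://example.com/");
            Console.WriteLine("Got {0} characters", html.Length);
        });

        Console.ReadLine(); // keep the process alive for the demo
    }
}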

+1  A: 

I'm very confused by the double-reading, but it looks like you actually intend to do something like:

        StringBuilder sb = new StringBuilder();           
        using (StreamReader reader = new StreamReader(streamRemote))
        {
            char[] charBuffer = new char[bufferSize];
            int charsRead;
            while ((charsRead = reader.Read(charBuffer, 0, bufferSize)) > 0)
            {
                sb.Append(charBuffer, 0, charsRead);
                //Some progress calculation

                if (Progress != null) Progress(iProgressPercentage);
            }
        }
        string result = sb.ToString();

See if that works as desired. I wonder, however, whether Progress is the cause of the slowdown; try it with nothing assigned and see if that makes it quicker. Or only invoke it periodically:

            //[snip]
            int iteration = 0, charsRead;
            while ((charsRead = reader.Read(charBuffer, 0, bufferSize)) > 0)
            {
                sb.Append(charBuffer, 0, charsRead);
                //Some progress calculation
                if((++iteration % 20) == 0 && Progress != null) {
                    Progress(iProgressPercentage);
                }
            }
            //[snip]

Also, try increasing the buffer size.

Marc Gravell
But why does opening the stream take the same amount of time as the web browser needs to download the whole page?
Kaminari
+1  A: 

You only get the last iByteSize bytes of your file, since you overwrite the buffer on each iteration; you're not saving the buffer anywhere. Here's a sample of how to store the file in memory using a MemoryStream.

var totalBytes = new MemoryStream(1024 * 1024); // pre-size to 1 MB to cut re-allocations
while ((iByteSize = streamRemote.Read(byteBuffer, 0, byteBuffer.Length)) > 0)
{
    totalBytes.Write(byteBuffer, 0, iByteSize);
    iRunningByteTotal += iByteSize;

    //Some progress calculation
    if (Progress != null) Progress(iProgressPercentage);
}

When the whole download is complete, you can convert it into text.

var byteArray = totalBytes.GetBuffer();
var numberOfBytes = (int)totalBytes.Length; // Length is a long; GetString wants an int
var text = Encoding.Default.GetString(byteArray, 0, numberOfBytes);

Update: the DownloadStringAsync method basically does the same as the above, but will not give you any progress indication. There are other async methods, though, that will fire the DownloadProgressChanged event.
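For example, a rough sketch with DownloadDataAsync (the URL is a placeholder; note that ProgressPercentage is only meaningful when the server sends a Content-Length header):

using System;
using System.Net;

class ProgressDemo
{
    static void Main()
    {
        var client = new WebClient();

        // Fired repeatedly as chunks arrive.
        client.DownloadProgressChanged += (s, e) =>
            Console.WriteLine("{0}% ({1} bytes)", e.ProgressPercentage, e.BytesReceived);

        // Fired once when the download finishes.
        client.DownloadDataCompleted += (s, e) =>
            Console.WriteLine("Done: {0} bytes", e.Result.Length);

        client.DownloadDataAsync(new Uri("http://example.com/"));
        Console.ReadLine(); // keep the process alive while downloading
    }
}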

Update 2: Regarding response time: have you timed the download of the resource using some other tool? Major browsers have built-in support for timing such things.

Further, is it a static file you're serving up, or is the content generated on the server side?

A third thing that comes to mind is server-side buffering. For example, if the Response.Buffer property in ASP.NET is used, nothing is sent to the client until the whole file/page is done on the server side. The client thus has to wait before it can even start downloading.
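To illustrate, a sketch of a hypothetical ASP.NET Web Forms page (not the asker's actual server code):

using System;
using System.Web.UI;

public partial class SlowPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // With buffering disabled, ASP.NET starts sending bytes as soon
        // as they are written instead of waiting for the full page.
        Response.BufferOutput = false;
        Response.Write("<p>first chunk</p>");
        Response.Flush(); // pushes what we have so far to the client
    }
}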

Peter Lillevold
The browser loads the whole page in about 5 s. DownloadStringAsync speed is the same. But strangely I get several progress reports of 0 and then 100 at the end. It's not as async as I thought it would be.
Kaminari
@Kaminari - how large are the files we're talking about here? If they're small, it would probably be more efficient to serve them all at once rather than to chop the download into small pieces.
Peter Lillevold
@Peter Lillevold - it's not a file, it's a website. But only the HTML, without other elements. The size is about 1.33 MB.
Kaminari
@Kaminari - I see. You specifically mention "file" in the question, but, yeah, this furthers my theory that the server is buffering while constructing the HTML and thus will not start flushing to the client immediately. From where I sit I get a latency of 2.5 seconds while the pure download takes only 0.5 seconds, which could be an indication of buffering.
Peter Lillevold
@Peter Lillevold - yes, that's one problem; unfortunately I can't avoid it. But I figured out why the download was two times slower; check the first post for my solution.
Kaminari
@Kaminari - nice, it was the gzip Accept-Encoding header! Btw, see my updated sample on converting the memory buffer to text. It is faster than using MemoryStream.ToArray, since ToArray makes a copy of the content, which in your case is 1.33 MB!
Peter Lillevold
Yes, thank you, it is slightly faster. The difference is only a few ms, but that's still better than nothing.
Kaminari