I'm trying to get this piece of code working a little better. I suspect it's the loop reading one byte at a time. I couldn't find another way of doing this with gzip decompression. Implementing a StreamReader is fine, but it returns a string which I can't pass to the decompression stream.

Is there a better way?

byte[] bufffer = null;
List<byte> resourceBytes = new List<byte>();
int byteValue = 0;
WebResource resource = new WebResource();
HttpWebResponse webResponse = null;

try {
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(resourceUri);
    webRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
    webRequest.Headers.Add(HttpRequestHeader.AcceptCharset, "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
    webRequest.UserAgent = agent;
    webRequest.Accept = "text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1";
    webRequest.Credentials = CredentialCache.DefaultCredentials;
    webRequest.Referer = resourceUri.OriginalString;
    webRequest.Timeout = 5000;

    webResponse = (HttpWebResponse)webRequest.GetResponse();

    Stream webStream = webResponse.GetResponseStream();

    if (!string.IsNullOrEmpty(webResponse.ContentEncoding)) {
        if (webResponse.ContentEncoding.ToLower().Contains("gzip")) {
            webStream = new GZipStream(webStream, CompressionMode.Decompress);
        }
        else if (webResponse.ContentEncoding.ToLower().Contains("deflate")) {
            webStream = new DeflateStream(webStream, CompressionMode.Decompress);
        }
    }

    do {
        byteValue = webStream.ReadByte();

        if (byteValue != -1) {
            resourceBytes.Add((byte)byteValue);
        }

    } while (byteValue != -1);


    //Free up resources
    webStream.Close();
    webResponse.Close();

    bufffer = resourceBytes.ToArray();
+1  A: 

Is the WebClient class no use for what you want to do?

jmcd
WebClient is fine if I don't use compression. It's looking like I may have to
Sir Psycho
WebClient doesn't support decompression :(
Sir Psycho
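For what it's worth, WebClient has no compression switch of its own, but the HttpWebRequest it creates internally does: the AutomaticDecompression property. A minimal sketch, assuming you are free to subclass WebClient; the class name DecompressingWebClient is made up for illustration:

```csharp
using System;
using System.Net;

// Hypothetical subclass (not part of the framework): turns on automatic
// gzip/deflate handling for every request this client creates.
public class DecompressingWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests have the AutomaticDecompression property.
        HttpWebRequest httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            // Sends the Accept-Encoding header and decompresses the
            // response stream transparently.
            httpRequest.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }

        return request;
    }
}
```

With that in place, new DecompressingWebClient().DownloadData(resourceUri) should hand back the already decompressed bytes. The same property can be set directly on the HttpWebRequest in the question, which would make the manual GZipStream/DeflateStream wrapping unnecessary.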
A: 

If you want the response as a string you can do this.

string responseText;

StreamReader responseReader = new StreamReader(webStream);
responseText = responseReader.ReadToEnd();

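If you want an actual byte array, do this (sorry, I don't feel like converting to C# for this one):

'Only works if the stream supports Length; network and GZip streams throw NotSupportedException
'Declare an array the same size as the response
Dim ResponseData(CInt(webStream.Length) - 1) As Byte
'Read the data in one call (Read may return fewer bytes than requested)
webStream.Read(ResponseData, 0, ResponseData.Length)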
Kibbee
Didn't he say he had a problem with using a streamreader string?
NotJarvis
A: 

Kibbee, the GZipStream class only accepts a Stream object and Stream objects can only be read one byte at a time.

If you try to read in chunks faster than the data is coming in, it will fall over.

Sir Psycho
Sorry, but no. Streams can be (and usually are) read in chunks. See my example.
Marc Gravell
This should have been posted as a comment on Kibbee's post, btw.
Daok
+6  A: 

I'd agree with jmcd that WebClient would be far simpler, in particular WebClient.DownloadData.

Re the actual question: the problem is that you are reading single bytes, when you should probably have a fixed buffer and loop, i.e.

int bytesRead;
byte[] buffer = new byte[1024];
while((bytesRead = webStream.Read(buffer, 0, buffer.Length)) > 0) {
  // process "bytesRead" worth of data from "buffer"
}

[edit to add emphasis] The important bit is that you only process "bytesRead" worth of data each time; everything beyond there is garbage.
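To collect the whole response into a byte[] as the question does, the loop above can write each chunk into a MemoryStream. This is only a sketch: the StreamHelpers class, the ReadAllBytes name and the 4096-byte default are illustrative choices, not anything from the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class StreamHelpers
{
    // Hypothetical helper: drains any readable stream (including a
    // GZipStream or DeflateStream) into a byte array using a fixed buffer.
    public static byte[] ReadAllBytes(Stream source, int bufferSize = 4096)
    {
        byte[] buffer = new byte[bufferSize];

        using (MemoryStream collected = new MemoryStream())
        {
            int bytesRead;

            // Read returns 0 only at end of stream; otherwise it returns
            // however many bytes it has, up to buffer.Length.
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Copy only the valid part of the buffer.
                collected.Write(buffer, 0, bytesRead);
            }

            return collected.ToArray();
        }
    }
}
```

In the question's code, the do/while loop and the List<byte> would then shrink to bufffer = StreamHelpers.ReadAllBytes(webStream);. On .NET 4 and later, Stream.CopyTo does the same job with a built-in loop.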

Marc Gravell
Good answer. This is likely the cause of the slowness.
samjudson
Correct - you posted this as I was preparing an answer saying much the same....
NotJarvis
Fixed buffer loops are fine if your internet connection can keep up; otherwise it falls over. The WebClient class somehow takes care of this problem, but I can't use compression if I choose to do it that way.
Sir Psycho
Sir Psycho - how can I state this strongly enough: you are wrong...
Marc Gravell
DownloadDataAsync might be appropriate also.
Jon Grant
A: 

You're not understanding what I'm saying.

The GZipStream class accepts a stream as an argument. Then you use GZipStream to read the decompressed data. Unfortunately, GZipStream only allows you to read one byte at a time (reliably).

Reading blocks of data is only good if your connection can keep up; otherwise the Length property will be 0 even though more data is on its way.

Sir Psycho
OK Sir Psycho, please understand that posting an answer is for answering the question, and posting a comment is for adding to someone else's answer.
Daok
Sorry, but you are wrong. GZipStream acts like any other stream; a call to Read will block until (either of) the stream is closed [no more data], or *some* data [at least one byte] is available. It doesn't guarantee to fill your buffer.
Marc Gravell
And note that I didn't even look at .Length...
Marc Gravell
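One way to check the behaviour described in the two comments above without a real slow connection is to wrap the compressed data in a stream that deliberately hands out only a few bytes per call. A rough sketch; TrickleStream and TrickleDemo are made-up names, and the in-memory payload stands in for a network response:

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical test stream: returns at most maxChunk bytes per Read call,
// imitating a connection that cannot keep up with the reader.
public class TrickleStream : Stream
{
    private readonly Stream inner;
    private readonly int maxChunk;

    public TrickleStream(Stream inner, int maxChunk)
    {
        this.inner = inner;
        this.maxChunk = maxChunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, Math.Min(count, maxChunk));
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    // Like a network stream, this one has no usable Length or Position.
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class TrickleDemo
{
    // Compresses payload, then decompresses it through a TrickleStream
    // using the fixed-buffer loop, never touching Length.
    public static byte[] RoundTrip(byte[] payload, int maxChunk)
    {
        MemoryStream compressed = new MemoryStream();
        using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress, true))
        {
            gzip.Write(payload, 0, payload.Length);
        }
        compressed.Position = 0;

        using (GZipStream gunzip = new GZipStream(
            new TrickleStream(compressed, maxChunk), CompressionMode.Decompress))
        using (MemoryStream result = new MemoryStream())
        {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = gunzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, bytesRead);
            }
            return result.ToArray();
        }
    }
}
```

Calling TrickleDemo.RoundTrip(payload, 3) should return the original payload even though the underlying stream never supplies more than three bytes at a time. The loop depends only on the return value of Read, not on Length.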
I'd agree with Daok. Rather than adding new answers in reply to other people, add comments to their posts or edit your original question to include additional information you would like considered.
Seth Petry-Johnson