ansaurus

Question

C# downloading a webpage. A better way needed, CPU usage high

Answer 1

+1 A:

Is the WebClient class no use for what you want to do?

jmcd 2008-10-22 12:56:39

WebClient is fine if I don't use compression. It's looking like I may have to.

Sir Psycho 2008-10-22 13:00:13

WebClient doesn't support decompression :(

Sir Psycho 2008-10-22 13:10:00
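
On the decompression point: WebClient itself exposes no compression setting, but one commonly used workaround is to subclass it and switch on HttpWebRequest.AutomaticDecompression in the request it creates. The sketch below assumes that approach suits the use case; the class name DecompressingWebClient is made up for illustration and is not part of the framework.

```csharp
using System;
using System.Net;

// Hypothetical helper (the name is illustrative): a WebClient whose
// HTTP requests ask the framework to decompress gzip/deflate responses.
class DecompressingWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests have the AutomaticDecompression property.
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
        return request;
    }
}
```

With this in place, a call such as new DecompressingWebClient().DownloadData(url) should return the already-decompressed bytes, where url is whatever address is being fetched.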

Answer 2

A:

If you want the response as a string you can do this.

string responseText;

// Read the whole response stream into a string
System.IO.StreamReader responseReader = new System.IO.StreamReader(webStream);
responseText = responseReader.ReadToEnd();

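If you want an actual byte array, do this (sorry, I don't feel like converting to C# for this one; it is VB.NET):

'Declare an array the same size as the response (VB upper bounds are inclusive, hence the - 1)
'Note: this assumes the stream supports Length; a network response stream typically does not
Dim ResponseData(CInt(webStream.Length) - 1) As Byte
'Try to read all the data at once (Read may return fewer bytes than requested)
webStream.Read(ResponseData, 0, ResponseData.Length)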

Kibbee 2008-10-22 12:56:57

Didn't he say he had a problem with using a streamreader string?

NotJarvis 2008-10-22 12:59:03

Answer 3

A:

Kibbee, the GZipStream class only accepts a Stream object and Stream objects can only be read one byte at a time.

If you try to read in chunks faster than the data is coming in, it will fall over.

Sir Psycho 2008-10-22 12:59:14

Sorry, but no. Streams can be (and usually are) read in chunks. See my example.

Marc Gravell 2008-10-22 13:02:16

This should have been posted as a comment on Kibbee's post, btw.

Daok 2008-10-22 13:08:30

Answer 4

+6 A:

I'd agree with jmcd that WebClient would be far simpler, in particular WebClient.DownloadData.

Re the actual question: the problem is that you are reading single bytes, when you should probably have a fixed buffer and loop, i.e.

int bytesRead;
byte[] buffer = new byte[1024];
while((bytesRead = webStream.Read(buffer, 0, buffer.Length)) > 0) {
  // process "bytesRead" worth of data from "buffer"
}

[edit to add emphasis] The important bit is that you only process "bytesRead" worth of data each time; everything beyond there is garbage.

Marc Gravell 2008-10-22 13:01:35
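
For the compressed case discussed in this thread, the same fixed-buffer loop can be wrapped around a GZipStream. This is a minimal sketch, assuming the response body is gzip-encoded and that collecting the result in memory is acceptable; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipReadSketch
{
    // Reads a gzip-compressed stream to its end in 1 KB chunks and
    // returns the decompressed bytes.
    public static byte[] ReadAllDecompressed(Stream compressed)
    {
        using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[1024];
            int bytesRead;
            // Read blocks until some data is available and returns 0 only at end of stream.
            while ((bytesRead = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Copy only the bytes that were actually read on this pass.
                output.Write(buffer, 0, bytesRead);
            }
            return output.ToArray();
        }
    }
}
```

Passing the stream from the web response (webStream in the question) to ReadAllDecompressed would replace the byte-at-a-time loop.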

Good answer. This is likely the cause of the slowness.

samjudson 2008-10-22 13:03:56

Correct - you posted this as I was preparing an answer saying much the same....

NotJarvis 2008-10-22 13:04:35

Fixed buffer loops are fine if your internet connection can keep up; otherwise it falls over. The WebClient class somehow takes care of this problem, but I can't use compression if I choose to do it that way.

Sir Psycho 2008-10-22 13:08:08

Sir Psycho - how can I state this strongly enough: you are wrong...

Marc Gravell 2008-10-22 13:11:45

DownloadDataAsync might be appropriate also.

Jon Grant 2008-10-22 13:21:06

Answer 5

A:

You're not understanding what I'm saying.

The GZipStream class accepts a stream as an argument. Then you use GZipStream to read the decompressed data. Unfortunately, GZipStream only allows you to read one byte at a time (reliably).

Reading blocks of data is only good if your connection can keep up; otherwise the Length property will be 0 even though more data is on its way.

Sir Psycho 2008-10-22 13:06:49

OK Sir Psycho, please understand that posting an answer is for answering the question, and posting a comment is for adding to someone else's answer.

Daok 2008-10-22 13:09:10

Sorry, but you are wrong. GZipStream acts like any other stream; a call to Read will block until (either of) the stream is closed [no more data], or *some* data [at least one byte] is available. It doesn't guarantee to fill your buffer.

Marc Gravell 2008-10-22 13:09:55
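
Because a single Read may return fewer bytes than requested, code that needs a specific number of bytes has to keep calling Read until it has them or the stream ends. A minimal sketch of that idea follows; the helper name ReadUpTo is hypothetical.

```csharp
using System;
using System.IO;

static class StreamReadSketch
{
    // Keeps calling Read until "count" bytes have arrived or the stream ends.
    // Returns the number of bytes actually placed in "buffer".
    public static int ReadUpTo(Stream source, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = source.Read(buffer, total, count - total);
            if (n == 0)
            {
                break; // end of stream: no more data will arrive
            }
            total += n;
        }
        return total;
    }
}
```

Nothing here consults the Length property; the return value of Read is the only signal used.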

And note that I didn't even look at .Length...

Marc Gravell 2008-10-22 13:10:28

I'd agree with Daok. Rather than adding new answers in reply to other people, add comments to their posts or edit your original question to include additional information you would like considered.

Seth Petry-Johnson 2008-10-22 13:11:27
