I have a .NET 2.0 WinForms app that connects to a backend WAS server. I am using GZipStream to decompress data coming back from an HttpWebRequest call made to the server. The data returned is CSV, which Apache compresses with gzip. The entire server stack is Hibernate-->EJB-->Spring-->Apache.

For small responses, the performance is fine (<50ms). When I get a response >150KB, it takes more than 60 seconds to decompress. The majority of the time seems to be spent in the GZipStream constructor.

This is the code showing where I get the response stream from the HttpWebResponse call:

using (Stream stream = this.Response.GetResponseStream())
{
 if (this.CompressData && this.Response.ContentEncoding == "gzip")
 {
  // Decompress the response
  byte[] b = Decompress(stream);
  this.ResponseBody = encoding.GetString(b);
 }
 else
 {
  // Just read the stream as a string
  using (StreamReader sr = new StreamReader(stream))
  {
   this.ResponseBody = sr.ReadToEnd();
  }
 }
}

Edit 1

Based on the comment from Lucero, I modified the Decompress method to the following, but I do not see any performance benefit from loading the ResponseStream into a MemoryStream before instantiating the GZipStream.

private static byte[] Decompress(Stream stream)
{
 using (MemoryStream ms = new MemoryStream())
 {
  byte[] buffer = new byte[4096];
  int read = 0;

  while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
  {
   ms.Write(buffer, 0, read);
  }

  ms.Seek(0, SeekOrigin.Begin);

  using (GZipStream gzipStream = new GZipStream(ms, CompressionMode.Decompress, false))
  {
   read = 0;
   buffer = new byte[4096];

   using (MemoryStream output = new MemoryStream())
   {
    while ((read = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
    {
     output.Write(buffer, 0, read);
    }

    return output.ToArray();
   }
  }
 }
}

Based on the code above, can anyone see any issues? This seems quite basic to me, but it's driving me nuts.

Edit 2

I profiled the application using ANTS Profiler, and during the 60s of decompression, the CPU is near zero and the memory usage does not change.

Edit 3

The actual slowdown appears to be during the read of the stream returned by this.Response.GetResponseStream(). The entire 60s is spent loading the response stream into the MemoryStream. Once it's there, the call to GZipStream is quick.
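
One way to confirm where the time goes is to time the two phases separately. The following is a minimal sketch, not the code from the application: the class and method names (DecompressTiming, CopyAll, DecompressTimed) are hypothetical, and it assumes only Stopwatch from System.Diagnostics and the GZipStream already used above.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

static class DecompressTiming
{
    // Copies everything from source to destination and returns the byte count.
    public static long CopyAll(Stream source, Stream destination)
    {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Hypothetical helper: reports the time spent pulling bytes off the
    // source stream separately from the time spent inflating them.
    public static byte[] DecompressTimed(Stream responseStream,
        out TimeSpan readTime, out TimeSpan inflateTime)
    {
        using (MemoryStream compressed = new MemoryStream())
        {
            // Phase 1: read the (still compressed) bytes from the source.
            Stopwatch sw = Stopwatch.StartNew();
            CopyAll(responseStream, compressed);
            readTime = sw.Elapsed;

            compressed.Seek(0, SeekOrigin.Begin);

            // Phase 2: decompress from memory only.
            sw.Reset();
            sw.Start();
            using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Decompress, false))
            using (MemoryStream output = new MemoryStream())
            {
                CopyAll(gzip, output);
                inflateTime = sw.Elapsed;
                return output.ToArray();
            }
        }
    }
}
```

If readTime accounts for nearly all of the elapsed time while inflateTime stays small, the delay is in receiving the response rather than in GZipStream, which would match the near-zero CPU usage seen in the profiler.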

Edit 4

I found that using HttpWebRequest.AutomaticDecompression exhibits the same performance issue, so I'm closing this question.
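
For reference, enabling AutomaticDecompression looks roughly like the sketch below. The helper name (CompressedRequest.Create) is hypothetical and used only for illustration; the AutomaticDecompression property and the DecompressionMethods flags are part of System.Net in .NET 2.0.

```csharp
using System;
using System.Net;

static class CompressedRequest
{
    // Hypothetical helper: builds a request that asks the server for
    // gzip/deflate and lets the framework decompress the response stream.
    public static HttpWebRequest Create(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }
}
```

With this set, the stream returned by GetResponseStream() yields decompressed data, so no manual Decompress step is needed. Since the delay shows up here as well, it points away from the decompression code itself.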

+1  A: 

Try first loading the data into a MemoryStream and then decompress the MemoryStream...

Lucero
I tried this - see the modified question. Thank you for the suggestion.
Joe
I see. Is the time still spent in the constructor of the GZip stream, or now somewhere else?
Lucero
This is (as far as I can tell) spent in the constructor of the GZip stream.
Joe
Does accessing the same URI with a browser (Firefox, IE, whatever) work fine without delay?
Lucero
Yes - I can access it using a curl script and it returns without delay. And, the curl script is using compression (--compressed argument to curl).
Joe
A: 

Sorry to not answer your question directly, but have you looked at SharpZipLib yet? I found it much easier to use than GZipStream. If you have trouble solving your current problem, perhaps it would perform the task better.

http://www.icsharpcode.net/OpenSource/SharpZipLib/

Cj Anderson
I have tried SharpZipLib and it exhibits the same poor performance as both System.IO.Compression.GZipStream and DotNetZip. I am going to step through the SharpZipLib source to see if anything jumps out at me.
Joe
Interesting... I have a large XML file, about 70 MB uncompressed, that decompresses in about 15 seconds on a system. I'm starting to wonder if it is really related to your code. Could you take a look at the antivirus on that system? Perhaps it is hanging up. We've had major problems with Etrust from IBM hanging up files for much longer than it should. I can provide a code sample if you like, but again I think it's not code related.
Cj Anderson
I'm trying to think of what else could be your bottleneck. You could try running a memory tester on that system. Maybe it has some faulty RAM? I'm just brainstorming for you. It just seems odd.
Cj Anderson
A: 

DotNetZip has a GZipStream class that can be used as a drop-in replacement for the System.IO.Compression.GZipStream.

DotNetZip is free.

NB: If you are only doing GZipStream, then you need the Ionic.Zlib.dll, not the Ionic.Zip.dll.

Cheeso
I tried using the DotNetZip/Zlib library but found the same performance issue.
Joe
If that's the case then it seems like it's not the DeflateStream. Maybe you have a memory issue. Maybe you should test more iterations - it's difficult to draw conclusions on performance based on a single iteration, a single trial.
Cheeso
I don't follow what you mean by "test more iterations". This is one of many requests to the same server. The majority of the requests get less than ~10 KB of data back. This is the only "large" request, and it's only ~150 KB.
Joe
Like I said in the other comment, I don't think it is a code problem. Check your server logs to see what is happening right when you fire that code. Is Antivirus locking that file for a brief moment? Gotta be something.
Cj Anderson