I am using URLOpenPullStream along with IBindStatusCallback and IHttpNegotiate callbacks to handle the negotiate, status, and data messages. The problem is that when the content is gzip-encoded (e.g. Content-Encoding: gzip), the data I receive via OnDataAvailable is still compressed. I need the uncompressed data. I am using the BINDF_PULLDATA | BINDF_GETNEWESTVERSION | BINDF_NOWRITECACHE binding flags. I have read some posts that say URLOpenPullStream should support the gzip format.

I initially tried to change the Accept-Encoding request header to specify that I did not want gzip, but was unsuccessful. I can change or add headers in BeginningTransaction, but it fails to change Accept-Encoding. I was able to change the User-Agent and to add a new header, so the mechanism works, but it would not override Accept-Encoding for some reason.

The other option is to un-gzip the data myself. In a quick test using a C++ gzip library, I was able to decompress the content, so this may be an option. If this is what I need to do, what is the best way to detect that the content is gzip? I noticed that I got an OnProgress event with BINDSTATUS_MIMETYPEAVAILABLE and the text set to "application/x-gzip-compressed". Is this how I should detect it?
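One way to detect gzip that does not depend on the reported MIME type is to look at the first bytes of the downloaded data. A gzip stream starts with the bytes 0x1f 0x8b, followed by the compression method byte 0x08 for deflate (RFC 1952). The following is only a minimal sketch: LooksLikeGzip is a hypothetical helper name, and it assumes the start of the download is already in a buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of URLMON or zlib.
// Returns true when the buffer starts with a gzip member header:
// ID1 = 0x1f, ID2 = 0x8b, CM = 0x08 (deflate), per RFC 1952.
bool LooksLikeGzip(const std::uint8_t* data, std::size_t size) {
    if (data == nullptr || size < 3) {
        return false;  // too short to contain the three header bytes
    }
    return data[0] == 0x1f && data[1] == 0x8b && data[2] == 0x08;
}
```

The check would be applied to the first bytes read from the stream in OnDataAvailable, before any of the data is handed on. It only inspects the header, so a truncated or corrupt gzip body would still pass it.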

I am looking for any solution to get around this problem! I do want to stay with URLOpenPullStream. This is a product that has already been released, and I wish to keep changes to a minimum.

A: 

I will answer my own question after more research. It seems that the website I am having the issue with is returning something incorrect, so that IE, FF, and URLOpenPullStream do not recognize it as valid gzip content. The headers appear to be fine, e.g.

HTTP/1.1 200 OK
Content-Type: text/html; charset=iso-8859-1
Content-Encoding: none
Server: Microsoft-IIS/6.0
MSNSERVER: H: COL102-W41 V: 15.4.317.921 D: 2010-09-21T20:29:43
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 4258
Date: Wed, 27 Oct 2010 20:48:15 GMT
Connection: keep-alive
Set-Cookie: xidseq=4; domain=.live.com; path=/
Set-Cookie: LD=; domain=.live.com; expires=Wed, 27-Oct-2010 19:08:15 GMT; path=/
Cache-Control: no-cache, no-store
Pragma: no-cache
Expires: -1
Expires: -1

but URLOpenPullStream just downloaded it in raw compressed format, IE reports an error if you try to access the site, and FF shows garbage.

After testing with a site that does return valid gzip content, e.g. www.webcompression.org, IE, FF, and URLOpenPullStream all worked fine. So it appears that URLOpenPullStream does support gzip content, and in that case the decompression was transparent: in OnDataAvailable I received the uncompressed data, and in OnResponse the headers did not show Content-Encoding as gzip.

Unfortunately, this still did not solve my problem. I resolved it by checking the response headers in the OnResponse event. If Content-Encoding was gzip, I set a flag and, once the download was complete, used the zlib gzip routines to uncompress the content. This seemed to work fine, and it should be sufficient for my rare case, since I should normally never see Content-Encoding: gzip in the OnResponse headers because URLOpenPullStream handles the decompression transparently.
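For anyone wanting to do the same, the header check can be sketched roughly as below. This is not my exact code, just an illustration: ResponseSaysGzip, ToLower, and Trim are hypothetical helper names, and the function assumes it is given the raw response header text (the szResponseHeaders string passed to IHttpNegotiate::OnResponse). It scans every Content-Encoding line, because a response can carry more than one, as the headers shown earlier do.

```cpp
#include <algorithm>
#include <cwctype>
#include <sstream>
#include <string>

// Lower-cases a wide string; header names and values here are plain ASCII.
static std::wstring ToLower(std::wstring s) {
    std::transform(s.begin(), s.end(), s.begin(), [](wchar_t c) {
        return static_cast<wchar_t>(std::towlower(c));
    });
    return s;
}

// Strips spaces, tabs, CR, and LF from both ends.
static std::wstring Trim(const std::wstring& s) {
    const wchar_t* ws = L" \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::wstring::npos) return L"";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns true if any Content-Encoding header line
// in the raw response headers mentions gzip (this also matches "x-gzip").
bool ResponseSaysGzip(const std::wstring& rawHeaders) {
    std::wistringstream in(rawHeaders);
    std::wstring line;
    while (std::getline(in, line)) {
        const auto colon = line.find(L':');
        if (colon == std::wstring::npos) continue;  // status line or blank line
        const std::wstring name = ToLower(Trim(line.substr(0, colon)));
        if (name != L"content-encoding") continue;
        const std::wstring value = ToLower(Trim(line.substr(colon + 1)));
        if (value.find(L"gzip") != std::wstring::npos) return true;
    }
    return false;
}
```

In OnResponse the result would be stored in a member flag, and the decompression would run only after the download completes. With zlib, one common way to decode gzip data is to initialize the stream with inflateInit2 and a windowBits value of 16 + MAX_WBITS.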

Dunno :)

Ron