I'm making HTTP requests using Winsock and I need to parse the HTML. The problem is that some of the sites I'm working with gzip-compress the HTML no matter what I specify in my request headers. I've even tried downgrading the request to HTTP/1.0, with no success. So now I'm forced to actually decompress the gzip data, but I'm having no luck. I had my program write the compressed content to a text file, then I compiled gzip and tried to use that file as its input. It gave an error saying that it was a multi-part gzip file. After some reading I found out that this is caused by the gzip header being corrupted because the data was not handled as binary. I'm not sure what to do at this point.

+1  A: 

When you write the gzipped data to a file, have you opened it as a binary file? Assuming you are using C as in the title, did you open with fopen(..., "wb")?
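To illustrate the point, here is a minimal sketch of binary-safe file handling in C. The function names (`save_binary`, `load_binary`, `looks_like_gzip`) are made up for this example and are not from the original program. It assumes the compressed body is already in a byte buffer with a known length. On Windows, a stream opened in text mode translates newline bytes on write and can treat a 0x1A byte as end-of-file on read, which is enough to corrupt gzip data.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write exactly len bytes to path.
 * "wb" opens the stream in binary mode, so no newline translation occurs. */
int save_binary(const char *path, const unsigned char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    /* Use the explicit length; gzip data contains zero bytes,
     * so string functions such as fputs or strlen would truncate it. */
    size_t written = fwrite(data, 1, len, fp);

    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* Hypothetical helper: read up to cap bytes from path in binary mode.
 * Returns the number of bytes read, or 0 if the file cannot be opened. */
size_t load_binary(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;

    size_t n = fread(buf, 1, cap, fp);
    fclose(fp);
    return n;
}

/* A gzip stream starts with the two magic bytes 0x1f 0x8b. */
int looks_like_gzip(const unsigned char *data, size_t len)
{
    return len >= 2 && data[0] == 0x1f && data[1] == 0x8b;
}
```

If the data comes from `recv`, the length passed to `save_binary` should be the byte count that `recv` returned, not the result of `strlen`, and only the bytes after the blank line that ends the HTTP headers belong in the file. Checking the first two bytes with `looks_like_gzip` before writing is a quick way to see whether the buffer still starts at the gzip header.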

Thomas Padron-McCarthy
I tried this but it didn't help. It actually wrote the exact same thing to the file.
silverbandit91
Are you reading from the web server using a FILE stream? In that case, is that one opened as binary?
Thomas Padron-McCarthy
Never mind, I got it. Thank you for your help.
silverbandit91