I've been using HttpWebRequest/HttpWebResponse lately and I'm running into encoding problems. HttpWebResponse.CharacterSet doesn't always match the real page encoding, so I thought I could use the Content-Type meta tag instead.

  1. How can I read the Content-Type meta tag if I can't even decode the response (when the charset in the HTTP header is wrong)?
  2. Is there an open source solution that automatically deals with page encoding and can download the source of a URL the way a browser does?

Note that I don't need anything fancy like character set detection algorithms, just basic detection based on the HTTP header or meta tag elements.

Thanks in advance.

A: 

I used this solution. It works.
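For reference, here is a minimal sketch of the general idea, which is not necessarily what the linked solution does. It assumes you read the body as raw bytes first, decode a prefix as ISO-8859-1 (which maps every byte to one character, so the ASCII markup stays readable), look for a charset in a meta tag, and only then decode the whole body. The names `PageDownloader`, `DetectEncoding` and `DownloadString`, the 4096-byte prefix, and the order of preference (meta tag, then HTTP header, then UTF-8) are all my own assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET framework.
static class PageDownloader
{
    // Matches <meta charset="x"> as well as
    // <meta http-equiv="Content-Type" content="text/html; charset=x">.
    static readonly Regex MetaCharset = new Regex(
        @"<meta[^>]+charset\s*=\s*[""']?\s*([A-Za-z0-9_\-:.]+)",
        RegexOptions.IgnoreCase);

    // Returns null instead of throwing when the charset name is missing or unknown.
    static Encoding TryGetEncoding(string name)
    {
        if (string.IsNullOrWhiteSpace(name)) return null;
        try { return Encoding.GetEncoding(name.Trim()); }
        catch (ArgumentException) { return null; }
    }

    public static Encoding DetectEncoding(byte[] body, string headerCharset)
    {
        // Decode only a prefix as ISO-8859-1: lossless per byte, enough to find the tag.
        int prefixLength = Math.Min(body.Length, 4096); // assumed limit
        string prefix = Encoding.GetEncoding("ISO-8859-1").GetString(body, 0, prefixLength);

        Match m = MetaCharset.Match(prefix);
        Encoding fromMeta = m.Success ? TryGetEncoding(m.Groups[1].Value) : null;

        // Assumed preference: meta tag, then HTTP header, then UTF-8.
        return fromMeta ?? TryGetEncoding(headerCharset) ?? Encoding.UTF8;
    }

    public static string DownloadString(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer); // keep the raw bytes so they can be decoded later
            byte[] body = buffer.ToArray();
            return DetectEncoding(body, response.CharacterSet).GetString(body);
        }
    }
}
```

Usage would be something like `string html = PageDownloader.DownloadString(url);`. This sketch does not handle byte order marks or UTF-16 pages, and on .NET Core you may need to register `CodePagesEncodingProvider` (from the System.Text.Encoding.CodePages package) before legacy code pages such as windows-1251 resolve.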

spender
Thanks! I might have to tweak it a bit but I got the concept.
James