ansaurus

Question

How do I detect a web page's charset and get the page content?

Answer 1

+1 A:

There is no fully reliable way to detect the proper charset. The best you can hope for is that the page declares it, either in the HTTP Content-Type header or with a <meta charset="utf-8"> tag in the document itself. Once you find such a declaration, you can switch your parser to that charset.

There are also libraries that try to guess the charset from the raw bytes, for example jchardet (http://jchardet.sourceforge.net/).
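As a rough illustration of the declared-charset approach described above, here is a minimal Java sketch. The class name CharsetSniffer and its methods are hypothetical and not part of any library. It assumes you already have the raw response bytes and the Content-Type header value. Checking the header first, scanning only the first 1024 bytes for a meta declaration, and falling back to UTF-8 are all assumptions made for this example. It does not do statistical detection the way jchardet does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only. */
final class CharsetSniffer {
    // Matches "charset=..." as it appears in a Content-Type header,
    // in <meta charset="...">, or in <meta ... content="text/html; charset=...">.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the first supported charset named in the text, or null if there is none. */
    static Charset find(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        while (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: keep looking.
            }
        }
        return null;
    }

    /** Decodes the page body using the header, then a meta declaration, then UTF-8. */
    static String decode(byte[] body, String contentType) {
        Charset cs = find(contentType);
        if (cs == null) {
            // ISO-8859-1 maps every byte to exactly one char, so the ASCII
            // markup near the top of the page stays readable for the regex.
            int n = Math.min(body.length, 1024);
            cs = find(new String(body, 0, n, StandardCharsets.ISO_8859_1));
        }
        if (cs == null) {
            cs = StandardCharsets.UTF_8; // assumed default for this sketch
        }
        return new String(body, cs);
    }
}
```

To get the inputs, one option is java.net.HttpURLConnection: getContentType() returns the header value and getInputStream().readAllBytes() (Java 9 or later) returns the body. The regex scan is crude and can be fooled by the text "charset=" appearing outside a meta tag, so treat the result as a hint rather than a guarantee.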

Kristoffer E 2010-08-23 07:34:36
