Hi,

I am obtaining strings from the web which often contain accented characters not recognised from within my application.

Edit - I'm obtaining my string using the HtmlAgilityPack. I am taking the InnerText of a <title> tag. Whilst doing this the Pack uses a different encoding from the original HTML document (I'm not sure which ones though?).

        // Get the inner text of the HTML <title> and assign it to the htmlParts object
        HtmlNode titleNode = doc.DocumentNode.SelectSingleNode("//title");
        // InnerText is already a string, so no ToString() call is needed
        htmlParts.htmlTitle = titleNode.InnerText;

Can anyone tell me how I can go from getting "(Subtitulado al espaÃ±ol).avi" to "(Subtitulado al español).avi"?

I'd very much appreciate it. :)

A: 

Apply the proper encoding to the data you read. How exactly? Good question. To answer it, you at least need to provide the code that causes the problem in the first place.

liho1eye
+2  A: 

It looks like you're getting UTF-8, but processing it as ISO-8859-1.

It's not possible to give more concrete information without knowing more about your system.
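
To illustrate the diagnosis, here is a minimal sketch that reproduces the garbling and reverses it. It assumes the bytes really were UTF-8 and were decoded as ISO-8859-1 (if Windows-1252 was used instead, some characters will not round-trip). The class and method names (MojibakeDemo, RepairLatin1AsUtf8) are made up for this example and are not part of HtmlAgilityPack.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: assumes 'garbled' came from UTF-8 bytes that
    // were wrongly decoded as ISO-8859-1. Re-encoding with ISO-8859-1
    // recovers the original bytes, which are then decoded as UTF-8.
    public static string RepairLatin1AsUtf8(string garbled)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] originalBytes = latin1.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        string correct = "(Subtitulado al español).avi";

        // Reproduce the problem: UTF-8 bytes read back as ISO-8859-1.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(correct);
        string garbled = Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);

        // 'garbled' now contains "espaÃ±ol" in place of "español".
        Console.WriteLine(garbled);

        // Reverse the wrong decoding step.
        Console.WriteLine(RepairLatin1AsUtf8(garbled));
    }
}
```

This only repairs a string after the fact. If you control how the page is fetched or loaded, decoding the bytes with the correct encoding at that point is usually the cleaner fix, since the repair above silently produces wrong results for text that was never garbled.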

Michael Madsen
I have updated the question as suggested... hope that helps.
AlexW