I have a string like &#1056;&#1091;&#1089;&#1089;&#1082;&#1086;&#1077; &#1048;&#1084;&#1103;. How can I display it as real text in a WPF TextBox? And how do I encode, for example, a Russian string entered into a text input as UTF-8?

A: 

You could use something like this (untested):

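// Requires: using System.Text.RegularExpressions;
string outString = Regex.Replace(inString, "&#(?<num>[0-9]+);", delegate(Match m)
{
    // Parse the decimal code point and convert it to the matching character(s).
    int num = int.Parse(m.Groups["num"].Value);
    return char.ConvertFromUtf32(num);
});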

Of course, if performance is critical, you should avoid regular expressions and use a hand-written parser instead: scan across the string with a loop until you find an ampersand followed by a hash, extract the number up to the semicolon, and so on.
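The hand-written approach could be sketched roughly as follows. This is only an illustration, not a tuned implementation: the class and method names (`NumericEntityDecoder`, `Decode`) are made up for this example, it handles decimal references only (no `&#x...;` hex form and no named entities such as `&amp;`), and anything it cannot decode is copied through unchanged.

```csharp
using System;
using System.Text;

// Hypothetical helper: name and shape are assumptions for illustration.
static class NumericEntityDecoder
{
    // Decodes decimal references such as "&#1056;"; everything else is copied as-is.
    public static string Decode(string input)
    {
        var result = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            // Look for the "&#" prefix that starts a numeric reference.
            if (input[i] == '&' && i + 1 < input.Length && input[i + 1] == '#')
            {
                int j = i + 2;
                int codePoint = 0;
                // Accumulate ASCII digits; stop once the value leaves the Unicode range.
                while (j < input.Length && input[j] >= '0' && input[j] <= '9'
                       && codePoint <= 0x10FFFF)
                {
                    codePoint = codePoint * 10 + (input[j] - '0');
                    j++;
                }
                bool hasDigits = j > i + 2;
                bool terminated = j < input.Length && input[j] == ';';
                // Surrogate values are not valid code points on their own.
                bool valid = codePoint <= 0x10FFFF
                             && (codePoint < 0xD800 || codePoint > 0xDFFF);
                if (hasDigits && terminated && valid)
                {
                    result.Append(char.ConvertFromUtf32(codePoint));
                    i = j + 1; // continue after the ';'
                    continue;
                }
            }
            // Not a decodable reference: keep the character unchanged.
            result.Append(input[i]);
            i++;
        }
        return result.ToString();
    }
}
```

Calling `NumericEntityDecoder.Decode("&#1056;&#1091;&#1089;&#1089;&#1082;&#1086;&#1077; &#1048;&#1084;&#1103;")` should give back the Cyrillic text from the question, and the result can then be assigned to the text box's `Text` property like any other string.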

As for converting to UTF-8: .NET stores all its strings internally as UTF-16, so the encoding usually matters only when you convert the string to some binary representation. In this case you can use the following:

Encoding.UTF8.GetBytes(inString)
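
To make the string/bytes distinction concrete, here is a small round-trip sketch. The wrapper class `Utf8RoundTrip` and its method names are assumptions for illustration; only `Encoding.UTF8.GetBytes` and `Encoding.UTF8.GetString` come from the framework.

```csharp
using System;
using System.Text;

// Hypothetical wrapper, for illustration only.
static class Utf8RoundTrip
{
    // Encode a .NET string (UTF-16 in memory) into UTF-8 bytes.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Decode UTF-8 bytes back into a .NET string.
    public static string FromUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

For the string from the question, which has 11 characters, `ToUtf8` should return 21 bytes: each of the ten Cyrillic letters takes two bytes in UTF-8 and the space takes one. Passing those bytes to `FromUtf8` gives the original string back.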

A: 

To decode this and other HTML-encoded strings, use HtmlDecode() as shown below:

System.Web.HttpUtility.HtmlDecode("&#1056;&#1091;&#1089;&#1089;&#1082;&#1086;&#1077; &#1048;&#1084;&#1103;")

This decodes to Русское Имя. As for UTF-8, as bwreichle said, you can use:

Encoding.UTF8.GetBytes(@"Русское Имя")
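
If adding a reference to System.Web is inconvenient in a desktop project, `System.Net.WebUtility.HtmlDecode` (available since .NET Framework 4) should be usable for the same decoding step. A minimal sketch, where the wrapper class `HtmlDecodeExample` is a made-up name for illustration:

```csharp
using System;
using System.Net;

// Hypothetical wrapper, for illustration only.
static class HtmlDecodeExample
{
    public static string Decode(string encoded)
    {
        // WebUtility lives in System.Net, so no System.Web reference is needed.
        return WebUtility.HtmlDecode(encoded);
    }
}
```

The returned string can be assigned directly to a text box's `Text` property; no further conversion is needed for display.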
wpfwannabe