Let's say I have a random Chinese character, 玩. I want to convert it to Unicode, which would be U+73A9. How could I do this in C#?
+2
A:
The character 玩 is in Unicode.
If you have it in C# as 玩, then it's currently in UTF-16, which is one of the Unicode encoding forms.
If you are obtaining it from somewhere else, you need to:
1. Find the encoding it is in.
2. Get the bytes (wrapped by a stream is nice).
3. Get or write an appropriate Encoding.
4. Use the encoding to decode the bytes into a string (wrapping the nice stream with a TextReader is nicer).
Step 3 may be simple (oh, I just use that one!), hard (darn, have to write it myself!), or somewhere in between (hey, has anyone written one of these already?!).
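The steps above can be sketched roughly as follows. This is only an illustration: the class and method names (DecodeSketch, Decode) are made up for the example, and it assumes the source bytes turn out to be UTF-8, which is the easy case where a built-in Encoding already exists.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class DecodeSketch
{
    // Decodes raw bytes into a string using the encoding you determined in step 1.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        // Step 2: wrap the bytes in a stream.
        using (var stream = new MemoryStream(bytes))
        // Steps 3 and 4: hand the encoding to a TextReader and let it decode.
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For example, DecodeSketch.Decode(new byte[] { 0xE7, 0x8E, 0xA9 }, Encoding.UTF8) should give back the string "玩", since E7 8E A9 is the UTF-8 form of U+73A9.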
Jon Hanna
2010-08-26 02:10:07
What I mean is I want to turn the character into U+73A9
Mass
2010-08-26 02:40:06
char c = '\u73a9';
GregS
2010-08-26 02:47:43
@Greg- thanks, but I want it the other way around. I want something like 玩 -> \u73a9
Mass
2010-08-26 02:50:32
+3
A:
Take myChar as a char referencing your special character...
Console.WriteLine("{0} U+{1:x4} {2}", myChar, (int)myChar, (int)myChar);
Above we're outputting the character itself, followed by its Unicode code point as four hexadecimal digits, and then the same value as a decimal integer.
Reduce the format string and parameters to output only the "U+..." code...
Console.WriteLine("U+{0:x4}", (int)myChar);
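In the format item {0:x4}, the 0 is the argument index, x asks for lowercase hexadecimal (X gives uppercase), and 4 is the minimum number of digits, padded with leading zeros. Casting a char only works for characters that fit in a single UTF-16 code unit; a rough sketch that also copes with surrogate pairs might look like the following. The names CodePointFormatter and ToCodePoints are invented for this example, and it assumes the input string is well-formed UTF-16 (no lone surrogates).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class CodePointFormatter
{
    // Returns one "U+XXXX" entry per code point in the text.
    public static List<string> ToCodePoints(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            // Combines a surrogate pair into one code point when needed.
            int codePoint = char.ConvertToUtf32(text, i);
            if (char.IsHighSurrogate(text[i]))
            {
                i++; // skip the low surrogate that was just consumed
            }
            // "X4": uppercase hex, at least four digits.
            result.Add(string.Format("U+{0:X4}", codePoint));
        }
        return result;
    }
}
```

With this, CodePointFormatter.ToCodePoints("玩") yields a single entry, "U+73A9", in the uppercase form asked for in the question.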
Allbite
2010-08-26 03:24:45
Thanks, this is awesome! Could you explain the code to me though? I understand you are just writing the U+, but what is `{0:x4}`? I know one of them is some specifier, so what is `:x4`?
Mass
2010-08-26 04:00:46
A:
A slightly longer example that follows the pattern in Jon Hanna's answer:
using System;
using System.Text;

namespace UnicodeDecodeConsoleApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            char c = '\u73a9';
            char[] chars = { c };

            // Big-endian UTF-16 puts the high byte first, so the hex digits
            // of the bytes read in the same order as the code point.
            Encoding encoding = Encoding.BigEndianUnicode;
            byte[] encodedBytes = encoding.GetBytes(chars);

            StringBuilder stringBuilder = new StringBuilder("U+");
            foreach (byte encodedByte in encodedBytes)
            {
                // "x2" formats each byte as two lowercase hex digits.
                stringBuilder.Append(encodedByte.ToString("x2"));
            }

            Console.WriteLine(stringBuilder); // prints U+73a9
            Console.ReadLine(); // keep the console window open
        }
    }
}
--jeroen
Jeroen Pluimers
2010-08-26 04:05:57