Hello, I have a very simple question that I can't seem to get my head around.

I have a properly encoded UTF-8 string that I parse into a JObject with Json.NET. I change some values and write the result to the command line, and I want the escaped characters (such as \u00f4) to stay escaped in the output.

Everything works great except for keeping the escaped characters intact.

Code:

var json = "{roster: [[\"Tulg\u00f4r\", 990, 1055]]}";
var j = JObject.Parse(json);
for (int i = 0; i < j["roster"].Count(); i++)
{
    j["roster"][i][1] = ((int)j["roster"][i][1]) * 3;
    j["roster"][i][2] = ((int)j["roster"][i][2]) * 3;
}
Console.WriteLine(JsonConvert.SerializeObject(j, Formatting.None));

Actual Output:

{"roster":[["Tulgôr",2970,3165]]}

Desired Output:

{"roster":[["Tulg\u00f4r",2970,3165]]}

My Google search phrasing must be off, since nothing useful came up. I'm sure it's something uber-easy and I will feel pretty stupid afterwards. :)

A: 

I'm not sure I see the problem here. The actual output contains the Unicode character, which was interpreted correctly after being specified with the \u syntax. It is the correct character, so it contains the correct "bytes". Of course it will be a .NET string, so it is held as UTF-16 rather than UTF-8.
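
To illustrate the point, here is a minimal sketch (the class name EscapeDemo is made up for this example). It assumes the JSON text sits in a regular C# string literal, as in the question's code, in which case the compiler itself resolves \u00f4 before Json.NET ever sees the string:

```csharp
using System;
using System.Text;

class EscapeDemo
{
    static void Main()
    {
        // In a regular (non-verbatim) C# string literal the compiler resolves
        // \u00f4, so the string already holds the single character U+00F4.
        string name = "Tulg\u00f4r";

        Console.WriteLine(name.Length);                      // 6 UTF-16 code units
        Console.WriteLine(name[4] == (char)0xF4);            // True
        Console.WriteLine(Encoding.UTF8.GetByteCount(name)); // 7 bytes when encoded as UTF-8
    }
}
```

So the \u form is only a way of writing the character down; once the string exists, it has to be re-escaped explicitly if the escape sequence is wanted in the output.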

chibacity
Right, but I don't want to have the character interpreted in the output. I want to see the unicode representation on the command line, so I can copy-paste the resulting string into a third-party application which requires the \u, since it doesn't seem to properly parse the input otherwise.
Stefan Pohl
OK, it looks like your third-party application is expecting ASCII or UTF-8. When you copy and paste, you will be copying and pasting UTF-16. What you want is Unicode escaped into plain ASCII. @Bradley's answer should do the trick.
chibacity
A: 

Take the output from JsonConvert.SerializeObject and run it through a helper method that converts all non-ASCII characters to their escaped ("\uHHHH") equivalent. A sample implementation is given below.

// Requires: using System.Globalization; using System.Text;
// Replaces non-ASCII with escape sequences;
// i.e., converts "Tulgôr" to "Tulg\u00f4r".
private static string EscapeUnicode(string input)
{
    StringBuilder sb = new StringBuilder(input.Length);
    foreach (char ch in input)
    {
        if (ch <= 0x7f)
            sb.Append(ch);
        else
            sb.AppendFormat(CultureInfo.InvariantCulture, "\\u{0:x4}", (int) ch);
    }
    return sb.ToString();
}

You would call it as follows:

Console.WriteLine(EscapeUnicode(JsonConvert.SerializeObject(j, Formatting.None)));

(Note that I don't handle non-BMP characters specially. The loop works on UTF-16 code units, so U+10000 is written as the surrogate pair "\ud800\udc00". I don't know if your third-party application wants that, "\U00010000", or something else!)
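
A small sketch of what that means in practice. The class name SurrogateDemo and the method name EscapeNonAscii are made up for this example; the method body repeats the char-by-char approach from above so the snippet stands alone:

```csharp
using System;
using System.Globalization;
using System.Text;

static class SurrogateDemo
{
    // Same char-by-char escaping as EscapeUnicode, repeated here so the
    // sketch is self-contained.
    public static string EscapeNonAscii(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            if (ch <= 0x7f)
                sb.Append(ch);
            else
                sb.AppendFormat(CultureInfo.InvariantCulture, "\\u{0:x4}", (int)ch);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // U+10000 lies outside the BMP, so .NET stores it as two UTF-16 code units.
        string s = char.ConvertFromUtf32(0x10000);

        Console.WriteLine(s.Length);          // 2
        Console.WriteLine(EscapeNonAscii(s)); // \ud800\udc00
    }
}
```

The surrogate-pair form is what the JSON grammar uses for characters outside the BMP, so it is probably the right default, but whether the third-party application accepts it is still an assumption to check.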

Bradley Grainger