I'm developing a game for Windows for learning purposes (I'm learning DirectX). I would like it to have Unicode support.

Reading this question I learned that Windows uses wchar_t, which is UTF-16. I want my game to have Lua scripting support, and Lua doesn't really like Unicode much. It simply treats strings as a "stream of bytes"; this works well enough for UTF-8, but UTF-16 would be virtually impossible to use.

Long story short: Windows wants UTF-16, Lua wants UTF-8.

So I thought, let's just use UTF-8 with normal char* and string! .length() will be messed up but who cares? However it doesn't work:

const char test_utf8[] = { 111, 108, 0xc3, 0xa9, 0 }; // UTF-8 for "olé" (é is 0xC3 0xA9)
mFont->DrawTextA(0, test_utf8, -1, &R, DT_NOCLIP, BLACK);
    /* DrawText is a Direct3D function to, well, draw text.
     * It's like MessageBox: it is a define to either DrawTextA
     * or DrawTextW, depending on whether UNICODE is defined. Here
     * we will use DrawTextA, since we are passing a normal char*. */

This prints olÃ©. In other words, it doesn't appear to use UTF-8 but rather ISO-8859-1.

So, what can I do? I can think of the following:

  1. Abandon the idea of Unicode; use ISO-8859-1 and be happy (this is what World of Warcraft does, at least for the enUS version).
  2. Convert every single string at every single frame from UTF-8 to UTF-16. I'm worried about performance, though: the conversion is O(N) and would run 60+ times a second for each string, so I'm pretty sure it will be fairly slow.
  3. Keep a UTF-16 copy of each Lua string. This is a huge waste of memory and very difficult to implement (keeping the UTF-16 strings up to date when they change in Lua, etc.).
+4  A: 

It doesn't use ISO-8859-1 either; it uses your system's default code page. You can convert the string to UTF-16 yourself and pass it to DrawTextW(). If your class library doesn't have any support for that, you can use MultiByteToWideChar().
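
A minimal sketch of that conversion, in case it helps. On Windows the usual route would be MultiByteToWideChar() with CP_UTF8; the hand-written decoder below is only a portable stand-in so the logic is visible. The name utf8_to_utf16 is made up for illustration, and the sketch assumes malformed input should be replaced with U+FFFD rather than rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a library function): decode UTF-8 into UTF-16.
// Assumption: malformed sequences become U+FFFD instead of raising an error.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const std::uint32_t min_cp[] = {0, 0x80, 0x800, 0x10000};

    std::u16string out;
    out.reserve(in.size());  // UTF-16 never needs more units than UTF-8 bytes
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else {  // stray continuation byte or invalid lead byte
            out.push_back(u'\uFFFD');
            ++i;
            continue;
        }
        ++i;

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated sequence; re-examine this byte next
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }

        // Reject overlong encodings, surrogate code points and out-of-range values.
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(u'\uFFFD');
            continue;
        }

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Outside the BMP: split into a high/low surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With a 16-bit wchar_t, as on Windows, the result holds the same UTF-16 code units that DrawTextW() expects. The conversion is a single linear pass, so doing it per frame for a handful of short strings is unlikely to be a bottleneck, though that is worth measuring rather than assuming.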

Hans Passant
+1  A: 

한국어/ì¡°ì„ ë§. I see this all the time in StarCraft because it doesn't have proper support for Unicode.

Fight the good fight! Use UTF-8. Convert to UTF-16 every frame (unless there's a better way to do it mentioned in the docs which I'm too lazy to look at). Don't worry about performance here until it becomes a problem!

Joey Adams
+1  A: 

I wouldn't be shocked if WoW doesn't use the DirectX text drawing methods. Having your own custom text drawing solution gives you a lot more flexibility in your support for encodings. It isn't too hard.

Torlack
That's actually not a bad idea. Do you have any links to papers/tutorials/etc that explain how to do this? (PS ran out of votes, will upvote in 60 minutes)
Andreas Bonini
+2  A: 

You can get Lua to cache your conversions to UTF-16:

-- __index is called with ( table , key ) when a key is missing,
-- so the string to convert is the key itself
utf16 = setmetatable ( {} , { __index = function ( t , k )
        local utf16str = my_conversion_func_to_utf16 ( k )
        rawset ( t , k , utf16str )
        return utf16str
    end } )

Then just have all your functions take only the UTF-16 string type, which could be a Lua string or some sort of userdata (for example, one wrapping your wchar_t array).
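
The same memoization idea can also live on the C++ side, if that is where the drawing calls are made. This is only a sketch: Utf16Cache is a made-up class, and the conversion function is passed in by the caller, so nothing here assumes a particular converter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical memoizing wrapper: each distinct UTF-8 string is converted
// once, and later lookups return the stored UTF-16 copy.
class Utf16Cache {
public:
    using Converter = std::function<std::u16string(const std::string&)>;

    explicit Utf16Cache(Converter convert) : convert_(std::move(convert)) {}

    // The returned reference stays valid until clear() is called;
    // std::unordered_map does not relocate its elements when it rehashes.
    const std::u16string& get(const std::string& utf8) {
        auto it = cache_.find(utf8);
        if (it == cache_.end()) {
            it = cache_.emplace(utf8, convert_(utf8)).first;
        }
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

    // Drop every cached conversion, e.g. when leaving a level or menu.
    void clear() { cache_.clear(); }

private:
    Converter convert_;
    std::unordered_map<std::string, std::u16string> cache_;
};
```

Strings that stay the same from frame to frame are then converted only once. Text that changes every frame (a frame counter, say) would add a new entry each time, so the cache needs an occasional clear() or some eviction policy; which one fits depends on the game.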

I can help more if you don't understand...

daurnimator