I'm developing a game for Windows for learning purposes (I'm learning DirectX), and I would like it to have Unicode support. Reading this question I learned that Windows uses wchar_t, which is UTF-16. I want my game to have Lua scripting support, and Lua doesn't really care about Unicode: it simply treats strings as streams of bytes. That works well enough for UTF-8, but makes UTF-16 virtually impossible to use.
Long story short: Windows wants UTF-16, Lua wants UTF-8.
So I thought: let's just use UTF-8 with plain char* and string! .length() will be off for multi-byte characters, but who cares? However, it doesn't work:
const char test_utf8[] = { 111, 108, '\xc3', '\xa9', 0 }; // UTF-8 for "olé"
mFont->DrawTextA(0, test_utf8, -1, &R, DT_NOCLIP, BLACK);
/* DrawText is a Direct3D function to, well, draw text.
 * Like MessageBox, DrawText is a #define that expands to either
 * DrawTextA or DrawTextW, depending on whether UNICODE is defined.
 * Here we call DrawTextA explicitly, since we are passing a plain char*. */
This prints olÃ©. In other words, it doesn't interpret the string as UTF-8 but as ISO-8859-1.
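For what it's worth, the conversion itself is mechanical. Below is a minimal, portable UTF-8 to UTF-16 decoder sketch (my own, not from any library; on Windows, MultiByteToWideChar with CP_UTF8 does the same job, and DrawTextW would then render the result correctly). It assumes well-formed input and does no error recovery:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Minimal UTF-8 -> UTF-16 decoder (handles the BMP and supplementary
// planes via surrogate pairs; assumes well-formed input).
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ) {
        unsigned char c = in[i];
        std::uint32_t cp;
        std::size_t len;
        if      (c < 0x80) { cp = c;        len = 1; } // 1-byte (ASCII)
        else if (c < 0xE0) { cp = c & 0x1F; len = 2; } // 2-byte sequence
        else if (c < 0xF0) { cp = c & 0x0F; len = 3; } // 3-byte sequence
        else               { cp = c & 0x07; len = 4; } // 4-byte sequence
        // Fold in the continuation bytes (6 payload bits each).
        for (std::size_t j = 1; j < len; ++j)
            cp = (cp << 6) | (in[i + j] & 0x3F);
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else { // outside the BMP: encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 | (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 | (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits, the resulting code units could go into a std::wstring and be passed to DrawTextW.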
So, what can I do? I can think of the following:
- Abandon the idea of Unicode; use ISO-8859-1 and be happy (this is what World of Warcraft does, at least for the enUS version)
- Convert every single string from UTF-8 to UTF-16 every frame (I'm worried about performance, though: at 60+ frames per second, an O(n) conversion of every string on every frame seems like it could get slow)
- For each Lua string, keep a UTF-16 copy; a huge waste of memory, and very difficult to implement (keeping the UTF-16 copies up to date when the strings change in Lua, etc.)
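A middle ground between the last two options might be a conversion cache: convert a string the first time it is drawn and reuse the UTF-16 copy on later frames, so the O(n) work happens once per distinct string rather than once per frame. This is only a sketch of the idea (Utf16Cache is a name I made up; the actual converter, e.g. MultiByteToWideChar with CP_UTF8 on Windows, is injected):

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache (a sketch, not an established API): convert a UTF-8
// string to UTF-16 only the first time it is seen; later frames reuse
// the stored copy. The converter is injected so that on Windows it can
// wrap MultiByteToWideChar(CP_UTF8, ...).
class Utf16Cache {
public:
    explicit Utf16Cache(std::function<std::wstring(const std::string&)> conv)
        : conv_(std::move(conv)) {}

    // Returns the cached UTF-16 version, converting on first use only.
    const std::wstring& get(const std::string& utf8) {
        auto it = cache_.find(utf8);
        if (it == cache_.end())
            it = cache_.emplace(utf8, conv_(utf8)).first;
        return it->second;
    }

private:
    std::function<std::wstring(const std::string&)> conv_;
    std::unordered_map<std::string, std::wstring> cache_;
};
```

Strings that Lua mutates would simply produce new cache entries; old ones could be evicted with any LRU-style policy if memory becomes a concern.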