Hi,

I'm wondering if you guys are aware of any article that shows how to make code fully Unicode. I ask because I'm working with WinAPI right now, and it seems that everything is supposed to be Unicode, like L"blabla". The functions I've encountered don't work properly when I simply pass a standard string, for example.

Thanks!

A: 

Check out http://msdn.microsoft.com/en-us/goglobal/bb688113.aspx. If you already know what Unicode itself is all about and just need to know about using it in Win32, start at the Creating Win32 Unicode Applications section.

Oren Trutner
A: 

Regarding your last statement: if you're talking about std::string, use std::wstring instead.

toto has already answered the key question: just use L"", WCHAR/wchar_t, and wstring everywhere that you'd normally use "", char, and string.
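As a minimal sketch of that advice (the helper names makeGreeting, asApiArgument and demo are made up for illustration and are not part of any API), staying wide everywhere looks roughly like this:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a greeting entirely from wide strings.
std::wstring makeGreeting(const std::wstring& name)
{
    // L"..." literals are arrays of wchar_t, so they combine directly with std::wstring.
    return L"Hello, " + name + L"!";
}

// Hypothetical helper: the pointer form that a wide-character API parameter expects.
const wchar_t* asApiArgument(const std::wstring& text)
{
    return text.c_str(); // valid only while 'text' is alive and unmodified
}

// Small usage example tying the two helpers together.
void demo()
{
    std::wstring greeting = makeGreeting(L"World");
    assert(greeting == L"Hello, World!");
    assert(asApiArgument(greeting)[0] == L'H');
}
```

The pointer returned by c_str() is what you would hand to a Win32 function that takes a const wide-string parameter.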

If you find yourself needing to convert between Unicode and ANSI, there lie the dragons. Very evil dragons that will eat you alive if you don't understand code pages, UTF-8, and so on. But for most types of applications this is the 2% case, if that. The rest is easy as long as you stay all-Unicode.

~jewels

Jewel S
When working with WinAPI a lot, I'd actually advise going with ATL `CString` over `std::wstring`: it uses Win32 text manipulation functions, so it aligns well with Windows locales, and it is easier than `std::wstring` to use with functions that take raw `WCHAR*` pointers (such as all API functions).
Pavel Minaev
A: 

Unicode for Windows applications can be summarized with a few simple rules:

  • In your project settings in Visual Studio choose Unicode (FYI: on Windows, Unicode is always UTF16).
  • Get a UTF8 to UTF16 and a UTF16 to UTF8 function. (You could write them yourself or find them on the internet. I personally use the ones from the Poco C++ libraries.)
  • In your code always use std::string for your class and function interfaces.
  • When you need to use a WinAPI call that takes a string parameter you can use the conversion functions to fill a local std::wstring, on which you can call the c_str() method to pass to the API function.
  • If you need to obtain textual data from a WinAPI function, you usually need to create a local TCHAR array on the stack, pass it to the function, construct a std::wstring from the result, and then convert it back to a std::string using your UTF16 to UTF8 conversion function.
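The conversion functions themselves are not shown in this answer, so here is a rough, portable sketch of what ToUTF16 and ToUTF8 could look like. It is an assumption on my part, not the Poco implementation. It uses std::u16string so that it behaves the same on every platform; on Windows, where wchar_t is 16 bits, the same logic applies to std::wstring, and in practice you would more likely wrap MultiByteToWideChar and WideCharToMultiByte with CP_UTF8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-16 code units.
// Sketch only: it does not reject overlong encodings or encoded surrogates.
std::u16string ToUTF16(const std::string& utf8)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes that follow
        if (lead < 0x80)              { cp = lead;        extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper: encodes UTF-16 code units as UTF-8.
std::string ToUTF8(const std::u16string& utf16)
{
    std::string out;
    for (std::size_t i = 0; i < utf16.size(); ++i)
    {
        std::uint32_t cp = utf16[i];
        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= utf16.size())
                throw std::invalid_argument("unpaired high surrogate");
            std::uint32_t low = utf16[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            throw std::invalid_argument("unpaired low surrogate");
        }
        if (cp < 0x80)
        {
            out.push_back(static_cast<char>(cp));
        }
        else if (cp < 0x800)
        {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else if (cp < 0x10000)
        {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else
        {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Usage example: a round trip should give back the original bytes.
void roundTripDemo()
{
    const std::string original = "h\xC3\xA9llo"; // 'e' with an acute accent, as UTF-8
    assert(ToUTF8(ToUTF16(original)) == original);
}
```

Throwing on malformed input is just one possible design choice; substituting U+FFFD for bad sequences would be another.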

Examples:

Getting text using stack-allocated array

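std::string getCurrentDirectory()
{
    TCHAR buffer[MAX_PATH];
    // Returns 0 on failure, or the required size if the buffer is too small.
    DWORD length = ::GetCurrentDirectory(MAX_PATH, buffer);
    if (length == 0 || length >= MAX_PATH)
    {
        return std::string();
    }
    return ToUTF8(buffer);
}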

Getting text using dynamically-allocated buffer

std::string getWindowText(HWND inHandle)
{
    std::string result;
    int length = ::GetWindowTextLength(inHandle);
    if (length > 0)
    {
        // std::vector (from <vector>) frees the buffer even if ToUTF8 throws.
        std::vector<TCHAR> buffer(length + 1);
        ::GetWindowText(inHandle, &buffer[0], length + 1);
        result = ToUTF8(&buffer[0]);
    }
    return result;
}

Setting text

void setWindowText(HWND inHandle, const std::string & inText)
{
    std::wstring utf16String = ToUTF16(inText);
    if (0 == ::SetWindowText(inHandle, utf16String.c_str()))
    {
        ReportError("Setting the text on component failed. Last error: " + getLastError(::GetLastError()));
    }
}

Hope this helps.

StackedCrooked
If all the native API is UTF-16, why would one want to go to std::string every time? Stay UTF-16, use CString (ATL) or std::wstring, and only go UTF-8 "at the edge" (for import/export/communication protocols, storage, etc.)
Mihai Nita
You're right that I could use UTF-16 by default and convert to UTF-8 for the outside interfacing, but you are also converting. Since most libraries use UTF8 and only Windows uses UTF16 I choose to define my "edge" at the WinAPI interfacing level.
StackedCrooked
A: 

When one of my projects needs to compile with UNICODE both on and off, I usually use the following definition to create an STL string that uses TCHAR instead of char or wchar_t:

#ifdef _UNICODE
    typedef std::wstring tstring;
#else
    typedef std::string tstring;
#endif

or the following may also work:

typedef std::basic_string<TCHAR> tstring;

Throughout the project I then declare all strings as tstring and use the _T() macro so that string literals get the matching character type.

When you then call a WIN32 API just use the .c_str() method on the string.
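A small sketch of how this fits together. The non-Windows branch below is only a stand-in for <tchar.h> so that the example compiles elsewhere, and makeTitle and titleDemo are hypothetical names used for illustration:

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <tchar.h> // provides TCHAR and _T() on Windows toolchains
#else
// Stand-ins for <tchar.h>, present only so this sketch builds off Windows.
#ifdef _UNICODE
typedef wchar_t TCHAR;
#define _T(x) L##x
#else
typedef char TCHAR;
#define _T(x) x
#endif
#endif

// Follows TCHAR: std::wstring in Unicode builds, std::string otherwise.
typedef std::basic_string<TCHAR> tstring;

// Hypothetical helper: builds a window title in whichever character type is active.
tstring makeTitle(const tstring& name)
{
    return tstring(_T("Window: ")) + name;
}

// Usage example: the same source line works in both build configurations.
void titleDemo()
{
    tstring title = makeTitle(_T("Main"));
    assert(title == _T("Window: Main"));
    const TCHAR* raw = title.c_str(); // the form a TCHAR-based API parameter takes
    assert(raw[0] == _T('W'));
}
```

The result of title.c_str() is a const TCHAR*, which matches the string parameters of the TCHAR-based API names in either configuration.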

pcdejager
I use this trick also, +1
Idan K