I have a _bstr_t string which contains Japanese text. I want to convert this string to a UTF-8 string which is defined as a char *.

Can I convert the _bstr_t string to a char * (UTF-8) string without losing the Japanese characters?

+8  A: 

Use WideCharToMultiByte() – pass CP_UTF8 as the first parameter.

Beware that BSTR can be a null pointer and that corresponds to an empty string – treat this as a special case.

sharptooth
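To make the two-call pattern behind that advice concrete, here is a minimal sketch, not a definitive implementation. The helper name WideToUtf8 is hypothetical, and the non-Windows branch is only a stand-in so the snippet can be compiled elsewhere; on Windows the work is done by WideCharToMultiByte() with CP_UTF8.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (the name is not from the thread). `text` may be null,
// which for a BSTR means "empty string"; `len` is the number of wchar_t
// units, e.g. what _bstr_t::length() reports.
std::string WideToUtf8(const wchar_t* text, std::size_t len)
{
    if (text == nullptr || len == 0)
        return std::string();               // null BSTR is treated as empty

    // WideCharToMultiByte takes the length as an int.
    assert(len <= static_cast<std::size_t>(INT_MAX));

#ifdef _WIN32
    // First call: ask how many bytes the UTF-8 form needs.
    const int needed = WideCharToMultiByte(CP_UTF8, 0, text, static_cast<int>(len),
                                           nullptr, 0, nullptr, nullptr);
    if (needed <= 0)
        return std::string();               // failed; GetLastError() has details

    // Second call: convert into a buffer of exactly that size.
    std::string utf8(static_cast<std::size_t>(needed), '\0');
    const int written = WideCharToMultiByte(CP_UTF8, 0, text, static_cast<int>(len),
                                            &utf8[0], needed, nullptr, nullptr);
    if (written != needed)
        return std::string();
    return utf8;
#else
    // Stand-in encoder for non-Windows builds; an assumption of this sketch,
    // not part of the answer. Unpaired surrogates are not rejected here.
    std::string utf8;
    for (std::size_t i = 0; i < len; ++i) {
        unsigned long cp = static_cast<unsigned long>(text[i]);
        // Combine a UTF-16 surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len) {
            const unsigned long lo = static_cast<unsigned long>(text[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            utf8 += static_cast<char>(cp);
        } else if (cp < 0x800) {
            utf8 += static_cast<char>(0xC0 | (cp >> 6));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            utf8 += static_cast<char>(0xE0 | (cp >> 12));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            utf8 += static_cast<char>(0xF0 | (cp >> 18));
            utf8 += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8;
#endif
}
```

With a _bstr_t named b, a call would look something like WideToUtf8(static_cast<const wchar_t*>(b), b.length()). The returned std::string owns the UTF-8 bytes, so its data and size can be handed to whatever expects a char* buffer and a byte count.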
A: 

Very handy MSDN reference for this sort of thing: http://msdn.microsoft.com/en-us/library/ms235631(VS.80).aspx

I think you need to go to wchar_t* since char* will lose the Unicode stuff, although I'm not sure.

// convert_from_bstr_t.cpp
// compile with: /clr /link comsuppw.lib

#include <iostream>
#include <stdlib.h>
#include <string>

#include "atlbase.h"
#include "atlstr.h"
#include "comutil.h"

using namespace std;
using namespace System;

int main()
{
    _bstr_t orig("Hello, World!");
    wcout << orig << " (_bstr_t)" << endl;

    // Convert to a char* (operator char* converts via the ANSI code page, not UTF-8)
    const size_t newsize = 100;
    char nstring[newsize];
    strcpy_s(nstring, (char *)orig);
    strcat_s(nstring, " (char *)");
    cout << nstring << endl;

    // Convert to a wchar_t*
    wchar_t wcstring[newsize];
    wcscpy_s(wcstring, (wchar_t *)orig);
    wcscat_s(wcstring, L" (wchar_t *)");
    wcout << wcstring << endl;

    // Convert to a CComBSTR
    CComBSTR ccombstr((char *)orig);
    if (ccombstr.Append(L" (CComBSTR)") == S_OK)
    {
        CW2A printstr(ccombstr);
        cout << printstr << endl;
    }

    // Convert to a CString
    CString cstring((char *)orig);
    cstring += " (CString)";
    cout << cstring << endl;

    // Convert to a basic_string
    string basicstring((char *)orig);
    basicstring += " (basic_string)";
    cout << basicstring << endl;

    // Convert to a System::String
    String ^systemstring = gcnew String((char *)orig);
    systemstring += " (System::String)";
    Console::WriteLine("{0}", systemstring);
    delete systemstring;
}
Nick
Thanks for your reply, Nick. The problem is that I want to send this _bstr_t content over a Windows socket, which accepts only char* data (please check the WSABUF structure in the ws2def.h file). So a wchar_t won't do. Is there a wide-char version of the _WSABUF structure?
Manav Sharma
Windows Sockets don't care what data you send. In this case you can just reinterpret_cast to char* and be fine.
sharptooth
Just don't get the number of bytes wrong: it's the number of UTF-16 code units times sizeof(WCHAR). And watch out for null BSTRs.
sharptooth
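The byte-count point from the comments above can be sketched as follows. This is only an illustration: ByteView and RawBytes are hypothetical names standing in for the (buf, len) pair of WSABUF, so the snippet does not depend on winsock2.h. Note that this approach sends the raw wide-character bytes (UTF-16 on Windows), not UTF-8, so the receiver has to decode them accordingly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the (buf, len) pair of WSABUF.
struct ByteView {
    const char* data;
    std::size_t size;   // in bytes, not characters
};

// `units` is the number of wchar_t elements; for a BSTR that would be
// SysStringLen(), for a _bstr_t its length() member.
ByteView RawBytes(const wchar_t* text, std::size_t units)
{
    // A null pointer (null BSTR) must come with a length of zero.
    assert(text != nullptr || units == 0);
    if (text == nullptr)
        return ByteView{nullptr, 0};        // nothing to send

    // Reinterpret the same memory as bytes and scale the length to bytes.
    return ByteView{reinterpret_cast<const char*>(text),
                    units * sizeof(wchar_t)};
}
```

The terminating null is not included in the count, so the receiver needs some other way to know where the string ends, for example a length prefix.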
-1: This does not convert to UTF-8.
Richard