Hi all,

I'm trying to parse Chrome bookmarks, but I've run into a problem. A snippet of the bookmarks file looks like this:

    {
        "date_added": "12915566290018721",
        "id": "16",
        "name": "hao123\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875",
        "type": "url",
        "url": "http://www.hao123.com/"
    }

The name field is stored as "hao123\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875", but it should be shown to users as "hao123－－我的上网主页". How can I convert "hao123\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875" to "hao123－－我的上网主页"?

Thanks!
+1  A: 

What you're looking at are JSON \uXXXX escape sequences, each of which denotes one UTF-16 code unit. Unless you have a JSON library that handles Unicode for you, consider iterating over the string and looking for the "\u" prefix that introduces each escape. From there you can convert the string to whatever encoding is necessary for it to display properly.
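The iteration described above can be sketched in portable C++ roughly as follows. This is a minimal sketch, not a full JSON string parser: the function names (`append_utf8`, `parse_hex4`, `unescape_json_unicode`) are hypothetical, UTF-8 is assumed as the target encoding, and other escapes such as `\\` or `\n` are copied through untouched, so an escaped backslash followed by `u` would be misread.

```cpp
#include <cstddef>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
static void append_utf8(std::string& out, unsigned long cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Parse exactly four hex digits starting at s[pos]; false if malformed.
static bool parse_hex4(const std::string& s, std::size_t pos, unsigned long& value) {
    if (pos + 4 > s.size()) return false;
    value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const char c = s[pos + i];
        unsigned long digit;
        if (c >= '0' && c <= '9')      digit = static_cast<unsigned long>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned long>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned long>(c - 'A' + 10);
        else return false;
        value = (value << 4) | digit;
    }
    return true;
}

// True if a well-formed \uXXXX escape starts at s[pos]; stores its value.
static bool read_escape(const std::string& s, std::size_t pos, unsigned long& value) {
    return pos + 1 < s.size() && s[pos] == '\\' && s[pos + 1] == 'u' &&
           parse_hex4(s, pos + 2, value);
}

// Replace every \uXXXX escape in `in` with its UTF-8 encoding.
// All other text is copied unchanged.
std::string unescape_json_unicode(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned long cu = 0;
        if (read_escape(in, i, cu)) {
            i += 6;  // length of "\uXXXX"
            // A high surrogate followed by a low surrogate forms one code point.
            unsigned long lo = 0;
            if (cu >= 0xD800 && cu <= 0xDBFF && read_escape(in, i, lo) &&
                lo >= 0xDC00 && lo <= 0xDFFF) {
                cu = 0x10000 + ((cu - 0xD800) << 10) + (lo - 0xDC00);
                i += 6;
            }
            append_utf8(out, cu);
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

Calling `unescape_json_unicode` on the raw text of the name field would yield the UTF-8 bytes of the bookmark title. A JSON library that already decodes escapes makes this step unnecessary.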

fbrereto
Thanks! I use Jsoncpp to parse it and get a C++ std::string, but I don't know how to convert it. Is there a function like MultiByteToWideChar?
Dan
Can anyone tell me how to convert the name field from std::string to wchar_t, or point me to relevant material? Thanks!
Dan
A: 

As far as I can tell from the Jsoncpp source code, it should decode the string properly and give you a UTF-8 string back. If that is not what you're seeing, please post the code that you're actually using, and what you're getting back instead.
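One way to check what the parser actually returned is to dump the bytes of the std::string. If the string is UTF-8, each fullwidth hyphen (U+FF0D) in the name should appear as the three bytes EF BC 8D. The helper below is a small sketch; `to_hex_bytes` is a hypothetical name and is not part of Jsoncpp.

```cpp
#include <cstddef>
#include <string>

// Return the bytes of `s` as space-separated uppercase hex, e.g. "EF BC 8D".
std::string to_hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}
```

Printing `to_hex_bytes(name)` for the parsed name field shows the raw bytes regardless of how the console or editor interprets them.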

Dean Harding
You are right. My code is: std::ifstream infile("Bookmarks"); Json::Value root; Json::Reader reader; bool ok = reader.parse(infile, root); if (!ok) { return; } std::string name = root.get("name", "").asString(); cout << name << endl; When I write the output to a file and open it in Vim, the result is: hao123锛嶏紞鎴戠殑涓婄綉涓婚〉 but if I open it in Microsoft Word, it prompts me to select the UTF-8 encoding and then shows the correct text. So I do think the string is UTF-8 encoded.
Dan
So I tried this code, but it does not give the right result. Why? std::string name = root.get("name", "").asString(); cout << name << endl; int len = strlen(name.c_str()) + 1; WCHAR outName[MAX_PATH]; MultiByteToWideChar(CP_UTF8, 0, name.c_str(), len, outName, len); wcout << outName << endl;
Dan
A: 

Thanks codeka, I solved the problem.

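    // Requires <windows.h>, <iostream>, <string> and <vector>.
    std::string name = root.get("name", "").asString();  // UTF-8 from Jsoncpp

    // UTF-8 -> UTF-16. Passing -1 as the input length includes the
    // terminating null; the first call only asks for the required size.
    int wlen = MultiByteToWideChar(CP_UTF8, 0, name.c_str(), -1, NULL, 0);
    if (wlen == 0) return;  // conversion failed
    std::vector<WCHAR> wide(wlen);
    MultiByteToWideChar(CP_UTF8, 0, name.c_str(), -1, &wide[0], wlen);

    // UTF-16 -> the system ANSI code page, which the console can display.
    int alen = WideCharToMultiByte(CP_ACP, 0, &wide[0], -1, NULL, 0, NULL, NULL);
    if (alen == 0) return;  // conversion failed
    std::vector<char> ansi(alen);
    WideCharToMultiByte(CP_ACP, 0, &wide[0], -1, &ansi[0], alen, NULL, NULL);

    cout << &ansi[0] << endl;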
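For readers who are not on Windows, the UTF-8 to UTF-16 step can also be written by hand. The following is a rough, portable C++11 sketch of that decoding, not a replacement for MultiByteToWideChar: `utf8_to_utf16` is a hypothetical name, malformed or truncated sequences are replaced with U+FFFD, and overlong encodings are not rejected.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 into UTF-16 code units (sketch; see the caveats above).
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }
        ++i;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (cp > 0x10FFFF) cp = 0xFFFD;
        if (cp >= 0x10000) {
            // Code points beyond the BMP become a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<char16_t>(cp);
        }
    }
    return out;
}
```

On Windows, where wchar_t holds UTF-16 code units, the system API used in the answer is the more robust choice. The sketch is only meant to show what that first conversion does.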

Thanks codeka & fbrereto again.

Dan