Say I have the text "Բարև Hello Здравствуй". (I store this text in a QString, but if you know another way to store it in C++ code, you're welcome to suggest it.) How can I convert this text to Unicode escapes like this: "\u1330\u1377\u1408\u1415 Hello \u1047\u1076\u1088\u1072\u1074\u1089\u1090\u1074\u1091\u1081" (see here)?
You first have to determine which encoding the text "Բարև Hello Здравствуй" is stored in. It mixes Armenian and Russian, so a single-byte code page such as Windows-1251 cannot hold all of it; it is more likely UTF-8 or another Unicode encoding. Then call the Windows API function MultiByteToWideChar with the required inputs, such as the code page and the source string.
Hope it helps.
I assume you're doing code-generation (of JavaScript, maybe?)
QString is like a collection of QChar. Loop through the contents, and on each QChar call the unicode method to get the ushort (16-bit integer) value.
Then format each character like "\\u%04X", i.e. \u followed by the 4-digit hex value.
NB. unicode() returns a numeric value, so formatting it with %04X needs no byte swapping on any platform. Byte order only matters if you read the raw bytes of the UTF-16 buffer yourself.
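The steps above can be sketched without Qt. This is only an illustration: std::u16string stands in for QString, and escapeUtf16Units is a hypothetical helper name, not a Qt or standard function. Each code unit is formatted as a number, so the sketch does no byte swapping.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape every UTF-16 code unit as \uXXXX (uppercase hex).
// std::u16string is used here as a stand-in for QString.
std::string escapeUtf16Units(const std::u16string& text) {
    std::string out;
    out.reserve(6 * text.size());  // each unit becomes exactly 6 characters
    for (char16_t unit : text) {
        char buf[7];  // backslash + 'u' + 4 hex digits + terminating NUL
        std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(unit));
        out += buf;
    }
    return out;
}
```

For example, escapeUtf16Units(u"\u0532 Hi") yields \u0532\u0020\u0048\u0069, because this variant escapes ASCII characters too.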
#include <cstdio>
#include <QtCore/QString>
#include <QtCore/QTextStream>

int main() {
    QString str = QString::fromWCharArray(L"Բարև Hello Здравствуй");
    QString escaped;
    escaped.reserve(6 * str.size());
    for (QString::const_iterator it = str.begin(); it != str.end(); ++it) {
        QChar ch = *it;
        ushort code = ch.unicode();
        if (code < 0x80) {
            escaped += ch;
        } else {
            escaped += "\\u";
            escaped += QString::number(code, 16).rightJustified(4, '0');
        }
    }
    QTextStream stream(stdout);
    stream << escaped << '\n';
}
Note this loops over UTF-16 code units, not actual code points.
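If actual code points are needed (for characters outside the Basic Multilingual Plane, which UTF-16 stores as a surrogate pair), the pairs have to be combined first. A minimal Qt-free sketch, assuming std::u16string input; toCodePoints is a hypothetical helper name, and unpaired surrogates are simply passed through unchanged:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-16 code units into Unicode code points.
// An unpaired surrogate is passed through unchanged rather than rejected.
std::vector<std::uint32_t> toCodePoints(const std::u16string& text) {
    std::vector<std::uint32_t> points;
    for (std::size_t i = 0; i < text.size(); ++i) {
        std::uint32_t unit = text[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < text.size()) {
            const std::uint32_t next = text[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Combine high and low surrogate into one code point.
                unit = 0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        points.push_back(unit);
    }
    return points;
}
```

For the text in the question this makes no difference, since Armenian and Cyrillic letters each fit in a single UTF-16 code unit.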
I have solved the problem with this code:
EDITED TO A BETTER VERSION: (I just do not want to convert Latin symbols to Unicode escapes, because that would consume additional space without any advantage for my problem (as a reminder, I want to generate Unicode RTF).)
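#include <QApplication>
#include <QMessageBox>
#include <QString>
#include <QTextCodec>

// Written against Qt 4 (QTextCodec::setCodecForTr no longer exists in Qt 5).
int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    // Interpret the string literal passed to tr() as UTF-8.
    QTextCodec::setCodecForTr(QTextCodec::codecForName("UTF-8"));
    QString str(QWidget::tr("Բարև (1-2+3/15,69_) Hello {} [2.63] Здравствуй"));
    QString strNew;
    QString isAcsii;
    QString tmp;
    foreach(QChar cr, str)
    {
        if(cr.toAscii() != QChar(0))
        {
            // toAscii() returned a non-zero char: copy the character through unchanged.
            isAcsii = static_cast<QString>(cr.toAscii());
            strNew+=isAcsii;
        }
        else
        {
            // Otherwise emit \u followed by the decimal value of the code unit.
            tmp.setNum(cr.unicode());
            tmp.prepend("\\u");
            strNew+=tmp;
        }
    }
    QMessageBox::about(0,"Unicode escapes!",strNew);
    return app.exec();
}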
Thanks to @Daniel Earwicker for the algorithm and of course +1.
BTW, the source file must be saved as UTF-8 in your text editor.
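Since the goal is RTF, a few details of the RTF \uN keyword may matter: N is a signed 16-bit decimal number (values above 32767 are written as negative), a fallback character for non-Unicode readers follows each escape by default, and the characters \, { and } have to be escaped in RTF text. The following is only a Qt-free sketch under those assumptions; escapeForRtf is a hypothetical helper name, std::u16string stands in for QString, and '?' is an arbitrary choice of fallback character.

```cpp
#include <string>

// Hypothetical helper: emit RTF-style \uN escapes for non-ASCII UTF-16 code units.
// ASCII passes through, except the RTF control characters \ { } which get a backslash.
std::string escapeForRtf(const std::u16string& text) {
    std::string out;
    for (char16_t unit : text) {
        if (unit == u'\\' || unit == u'{' || unit == u'}') {
            out += '\\';
            out += static_cast<char>(unit);
        } else if (unit < 0x80) {
            out += static_cast<char>(unit);
        } else {
            int value = unit;
            if (value > 32767) {
                value -= 65536;  // \uN takes a signed 16-bit decimal value
            }
            // '?' is the fallback shown by readers that do not understand \u.
            out += "\\u" + std::to_string(value) + "?";
        }
    }
    return out;
}
```

With this sketch, escapeForRtf(u"\u0532 Hi") gives \u1330? Hi, which matches the decimal style of the escapes in the question plus the fallback character.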