I am trying to compile the following code in my test application on Windows in Visual Studio for C++:
const wchar_t* chinese = "好久不见";
But I get the following error:
error C2440: 'initializing' : cannot convert from 'const char [5]' to 'const wchar_t *'
I am compiling with Unicode, so I am confused about this. The error goes away...
Hi,
I know Unicode contains all characters from most of the world's alphabets, but what about digits? Are they part of Unicode or not? I was not able to find a straight answer.
Thanks
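A quick way to check this yourself is to ask Python's `unicodedata` module. This is only a minimal sketch: the three sample digits (ASCII five, Arabic-Indic three, Devanagari seven) are picked purely for illustration.

```python
import unicodedata

# ASCII digits are ordinary Unicode code points (U+0030..U+0039).
for ch in "0123456789":
    assert unicodedata.category(ch) == "Nd"  # Nd = Number, decimal digit

# Digits from other scripts are encoded too, each with a numeric value.
# Illustrative samples: ASCII 5, Arabic-Indic 3, Devanagari 7.
samples = ["5", "\u0663", "\u096d"]
digit_info = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.digit(ch))
    for ch in samples
]
for row in digit_info:
    print(row)
```

Every one of these has the general category `Nd`, so digits are as much a part of Unicode as letters are.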
...
Is there any way to obtain a hexadecimal dump of a string in SQL Server? It'd be useful to troubleshoot character set and collation issues.
In MySQL you'd do SELECT HEX('€uro') and in Oracle you'd do SELECT DUMP('€uro') FROM DUAL.
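For comparison while troubleshooting, it can help to know which bytes to expect. The sketch below is not SQL Server code; it only computes, in Python, the hex form of the same sample string in two common encodings (the variable names are made up for this example).

```python
text = "\u20acuro"  # the sample string '€uro'

# Hex dump of the UTF-8 bytes and of the little-endian UTF-16 bytes.
utf8_hex = text.encode("utf-8").hex().upper()
utf16_hex = text.encode("utf-16-le").hex().upper()

print(utf8_hex)   # E282AC75726F
print(utf16_hex)  # AC20750072006F00
```

If a dump from the database matches neither form, the string was probably converted through a different code page on the way in.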
...
I am helping a client convert their Perl flat-file bulletin board site from ISO-8859-1 to Unicode.
Since this is my first time, I would like to know if the following "checklist" is complete. Everything works well in testing, but I may be missing something which would only occur on rare occasions.
This is what I have done so far (forgiv...
I've noticed that every time I put an:
import pdb; pdb.set_trace()
in my Spanish Django project, if I have a specific Unicode character in a string like:
Gracias por tu colaboración
I get a UnicodeDecodeError with an 'ordinal not in range(128)' in a Django Debug window. The problem is that I cannot debug my application easily. On the ot...
Unicode simply assigns an integer to each character. UTF-8 and other encodings turn these integers ("code points") into a sequence of bytes to be stored in memory. My question is: why can't we simply store each character as the binary representation of its Unicode value (the "code point")? Consequently, some languages have char...
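To make the trade-off concrete, here is a small sketch comparing how many bytes one character takes in each encoding. The helper name `byte_counts` and the three sample characters are just for illustration.

```python
def byte_counts(ch):
    """Return how many bytes one character occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

for ch in ("A", "\u00fc", "\u597d"):  # A, ü, 好
    print(f"U+{ord(ch):04X}", byte_counts(ch))

# UTF-32 is exactly the "raw code point" layout: four bytes, fixed width.
assert "A".encode("utf-32-be") == ord("A").to_bytes(4, "big")
```

So storing the code point directly is possible (that is what UTF-32 does), but it costs four bytes even for plain ASCII text, which is the usual reason variable-width encodings are preferred.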
When I try to print a Unicode string on my dev server it works correctly, but the production server raises an exception.
File "/home/user/twistedapp/server.py", line 97, in stringReceived
print "sent:" + json
File "/usr/lib/python2.6/dist-packages/twisted/python/log.py", line 555, in write
d = (self.buf + data).split('\n')
exceptions.U...
I'm new to learning Unicode, and not sure how much I have to learn based on my ASCII background, but I'm reading the C# spec on rules for identifiers to determine what chars are permitted within Azure Table (which is directly based on the C# spec).
Where can I find a list of Unicode characters that fall into these categories:
letter-c...
Hi.
I'm trying to get Mako to render a string with Unicode characters:
tempLook=TemplateLookup(..., default_filters=[], input_encoding='utf8',output_encoding='utf-8', encoding_errors='replace')
...
print sys.stdout.encoding
uname=cherrypy.session['userName']
print uname
kwargs['_toshow']=uname
...
return tempLook.get_template(page).re...
I'm working with an existing module at the moment that provides a C++ interface and does a few operations with strings.
I needed to use Unicode strings and the module unfortunately didn't have any support for a Unicode interface, so I wrote an extra function to add to the interface:
void SomeUnicodeFunction(const wchar_t* string)
How...
According to JavaScript: the Good Parts:
JavaScript was built at a time when Unicode was a 16-bit character set, so all characters in JavaScript are 16 bits wide.
This leads me to believe that JavaScript uses UCS-2 (not UTF-16!) and can only handle characters up to U+FFFF.
Further investigation confirms this:
> String.fromCharCod...
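For reference, this is how a code point above U+FFFF is split into two 16-bit units in UTF-16. It is a sketch in Python rather than JavaScript, and `to_surrogates` is a hypothetical helper written for this example, with U+1F600 as an arbitrary sample.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

high, low = to_surrogates(0x1F600)
print(f"{high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
expected = chr(0x1F600).encode("utf-16-be")
assert expected == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

A language whose strings are sequences of 16-bit units sees such a character as two units, which is why length and indexing can give surprising results for it.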
Through this forum, I have learned that it is not a good idea to use the following for converting CGI input (from either an escape()d Ajax call or a normal HTML form post) to UTF-8:
read (STDIN, $_, $ENV{CONTENT_LENGTH});
s{%([a-fA-F0-9]{2})}{ pack ('C', hex ($1)) }eg;
utf8::decode $_;
A safer way (which for example does not allow bog...
I am using an io.StringIO object to mock a file in a unit-test for a class. The problem is that this class seems to expect all strings to be unicode by default, but the builtin str does not return unicode strings:
>>> buffer = io.StringIO()
>>> buffer.write(str((1, 2)))
TypeError: can't write str to text stream
But
>>> buffer.write(str(...
I've created a memory mapped 1 bit interface to an LCD in an embedded system, along with 4 or 5 bit mapped fonts for the 90+ printable ASCII characters. Writing to the screen is as simple as using an echo-like statement (it's embedded Linux).
Other than something strictly proprietary, what recommendations can people make for storing Ge...
Other than something strictly proprietory, what recommendations can people make for storing Ge...
Standard grep/pcregrep etc. can conveniently be used with binary files for ASCII or UTF-8 data. Is there a simple way to make them try UTF-16 too (preferably simultaneously, but separately will do)?
Data I'm trying to get is all ASCII anyway (references in libraries etc.), it just doesn't get found as sometimes there's 00 between any two ch...
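If falling back to a small script is acceptable, one possible approach is to encode the search string in each candidate encoding and look for the resulting bytes. This is only a sketch and not a grep option: `find_text` and the sample `blob` are invented for the example.

```python
def find_text(data, needle):
    """Return {encoding: first offset} for each encoding in which needle occurs."""
    hits = {}
    for enc in ("utf-8", "utf-16-le", "utf-16-be"):
        pos = data.find(needle.encode(enc))
        if pos != -1:
            hits[enc] = pos
    return hits

# Hypothetical binary blob: one name stored as UTF-16-LE, one as plain ASCII.
blob = b"\x00\x01" + "kernel32.dll".encode("utf-16-le") + b"\xff" + b"user32.dll"

print(find_text(blob, "kernel32.dll"))
print(find_text(blob, "user32.dll"))
```

For pure ASCII search terms the UTF-8 and ASCII byte forms are identical, so the three encodings above cover the common cases.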
I have a long list of domain names which I need to generate some reports on. The list contains some IDN domains, and although I know how to convert them in Python on the command line:
>>> domain = u"pfarmerü.com"
>>> domain
u'pfarmer\xfc.com'
>>> domain.encode("idna")
'xn--pfarmer-t2a.com'
>>>
I'm struggling to get it to work with a ...
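For a whole list, the same conversion can be applied in a loop. The sketch below assumes Python 3 and that the names are already decoded text; the `domains` list is made up, and the built-in `idna` codec is used exactly as in the session above.

```python
# Hypothetical input list; in practice this would come from the report data.
domains = ["pfarmer\u00fc.com", "example.com"]

# In Python 3, str.encode("idna") returns bytes, so decode back to ASCII text.
encoded = [d.encode("idna").decode("ascii") for d in domains]

for original, ascii_form in zip(domains, encoded):
    print(original, "->", ascii_form)
```

Plain ASCII names pass through unchanged, so there is no need to filter the list first.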
I just received a virus that looks something like this
<script type='text/javascript'>
<!--
var s="=nfub!iuuq.frvjw>#sfgsfti#!------REST OF PAYLOAD REMOVED-----?";
m="";
for (i=0; i<s.length; i++)
{
if(s.charCodeAt(i) == 28)
{
m+= '&';
}
else if
(s.charCodeAt(i) == 23)
{ m+= '!';}
else
{
m+=String.fromCharCode(s...
I have some code that sorts table columns by object properties. It occurred to me that in Japanese or Chinese (non-alphabetical languages), the strings that are sent to the sort function would be compared the way an alphabetical language would.
Take for example a list of Japanese surnames:
寿拘
松坂
松井
山田
藤本
In English, these would be S...
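To see what a naive comparison does with these names, here is a short Python sketch. It only demonstrates the default behaviour, which orders strings by code point; it is not a locale-aware collation.

```python
surnames = ["寿拘", "松坂", "松井", "山田", "藤本"]

# The default sort compares strings code point by code point.
by_code_point = sorted(surnames)

for name in by_code_point:
    print(name, [f"U+{ord(c):04X}" for c in name])
```

The resulting order follows where each character happens to sit in the Unicode tables, not how the name is read, so a reading-based sort needs extra data (for example a phonetic key per name) or a collation library.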
Are these squares a representation of Chinese characters being turned into Unicode?
EDIT:[Here I entered the squares with numbers inside them into the post but they didn't render]
I'd like to either turn this back into the original characters when displayed in Android (or to enable MySQL to just store them as Chinese characters not in...
I have a string with many fractions like 1/2, 1/4 etc. I want to replace them with their Unicode equivalents.
I realise I could pick them up with
/\s(\d+)\/(\d+)\s/
How would I replace them with their Unicode equivalents? I could probably wrap the numbers in span and do something similar with CSS, but I was wondering if there was an ...
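One possible approach is a lookup table plus a regex callback. The sketch below is in Python and makes two assumptions: only a handful of fractions are mapped (the `VULGAR` table is illustrative, not complete), and the pattern uses lookarounds instead of `\s` so that fractions next to punctuation are matched while something like `1/2/2010` is left alone.

```python
import re

# Illustrative subset of the precomposed "vulgar fraction" characters.
VULGAR = {
    "1/2": "\u00bd",  # ½
    "1/4": "\u00bc",  # ¼
    "3/4": "\u00be",  # ¾
    "1/3": "\u2153",  # ⅓
    "2/3": "\u2154",  # ⅔
}

# A fraction that is not glued to further digits or slashes.
FRACTION = re.compile(r"(?<![\d/])(\d+)/(\d+)(?![\d/])")

def replace_fractions(text):
    """Swap known fractions for their single-character form; leave others as-is."""
    return FRACTION.sub(lambda m: VULGAR.get(m.group(0), m.group(0)), text)

print(replace_fractions("Add 1/2 cup and 3/4 tsp, or 5/7."))
```

Fractions without a precomposed character (such as 5/7) are left untouched here; how to render those is a separate design decision.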