unicode

How can I escape Unicode characters in an NSString?

When I store an NSString inside some NSDictionary and log that dictionary to the console like this: NSString *someString = @"Münster"; NSDictionary *someDict = [ NSDictionary dictionaryWithObjectsAndKeys: someString, @"thestring", nil ]; NSLog ( @"someDict: %@", [ someDict description ] ); The console output looks like this: unico...

Converting an AnsiString to a Unicode String

I'm converting a D2006 program to D2010. I have a value stored as a single-byte-per-character string in my database, and I need to load it into a control that has a LoadFromStream method, so my plan was to write the string to a stream and pass that to LoadFromStream. But it did not work. In studying the problem, I see an issue that tells me t...

Conversion of text to Unicode strings...

I have to process JSON files that look like this: \u0432\u043b\u0430\u0434\u043e\u043c <b>\u043f\u0443\u0442\u0438\u043c<\/b> \u043d\u0430\u0447 Unfortunately, I'm not sure what this encoding is called. I would like to convert it to .NET Unicode strings. What's the easiest way to do it? Thanks in advance! ...
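
The sample above uses the standard JSON `\uXXXX` string escapes, so a JSON parser decodes it without extra work. The question asks about .NET; the following is only a minimal sketch of the same idea in Python 3, and it assumes the text appears inside a JSON string value (the surrounding quotes are added here for illustration).

```python
import json

# Sample from the question, wrapped in double quotes so it is a valid JSON string.
raw = r'"\u0432\u043b\u0430\u0434\u043e\u043c <b>\u043f\u0443\u0442\u0438\u043c<\/b> \u043d\u0430\u0447"'

# json.loads turns every \uXXXX escape (and the \/ escape) into the real character.
decoded = json.loads(raw)
print(decoded)
```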

How to correct character encoding in IE8 native JSON?

I am using JSON with Unicode text, and having a problem with the IE8 native JSON implementation. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <script> var stringified = JSON.stringify("สวัสดี olé"); alert(stringified); </script> Using json2.js or Firefox native JSON, the alert() string is the same as...

Using a Unicode format for Python's `time.strftime()`

I am trying to call Python's time.strftime() function using a Unicode format string: u'%d\u200f/%m\u200f/%Y %H:%M:%S' (\u200f is the "Right-To-Left Mark" (RLM).) However, I am getting an exception that the RLM character cannot be encoded into ascii: UnicodeEncodeError: 'ascii' codec can't encode character u'\u200f' in position 2:...

llvm-clang; function/variable names containing Unicode characters

Hi! I'm interested in using Unicode characters (like \alpha) in function/variable names in my C++ program, which I will compile with clang++ on Linux. Does anyone know of a good guide / list of rules to go by for making sure that everything ends up compiling fine / avoiding linking errors / ... Thanks! ...

How do I create JavaScript escape sequences in PHP?

I'm looking for a way to create valid UTF-16 JavaScript escape sequence characters (including surrogate pairs) from within PHP. I'm using the code below to get the UTF-32 code points (from a UTF-8 encoded character). This works as JavaScript escape characters (eg. '\u00E1' for 'á') - until you get into the upper ranges where you get sur...
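
The surrogate-pair arithmetic mentioned above can be sketched as follows. The question is about PHP; this is an illustrative Python 3 version, and `js_escape` is a hypothetical helper name, not something from the question. It leaves ASCII untouched and does not escape quotes or backslashes, so it is not a complete JavaScript string encoder.

```python
def js_escape(text: str) -> str:
    """Return text with every non-ASCII character written as \\uXXXX escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII is passed through unchanged in this sketch.
            out.append(ch)
        elif cp <= 0xFFFF:
            # Characters in the Basic Multilingual Plane need one UTF-16 code unit.
            out.append("\\u%04X" % cp)
        else:
            # Above U+FFFF: subtract 0x10000 and split the 20 bits into two halves.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)    # top 10 bits
            low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
            out.append("\\u%04X\\u%04X" % (high, low))
    return "".join(out)


print(js_escape("\u00e1"))       # U+00E1, a single code unit
print(js_escape("\U0001D11E"))   # U+1D11E, needs a surrogate pair
```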

Hmm, why does finding by '2' or '2' return the same record?

Hi everyone, forgive my newbie question, but why does finding by '2' or '2' in MySQL return the same record? For example: say I have a record with a string field named 'slug', and the value is '2'. The following SQL statements return the same record. SELECT * FROM articles WHERE slug='2' SELECT * FROM articles WHERE slug='2' ...

PHP function to convert Unicode to special characters?

Is there a PHP function to handle the encodings below? .replaceAll("\u00c3\u0080", "&Agrave;") .replaceAll("\u00c3\u0081", "&Aacute;") .replaceAll("\u00c3\u0082", "&Acirc;") .replaceAll("\u00c3\u0083", "&Atilde;") .replaceAll("\u00c3\u0084", "&Auml;") .replaceAll("\u00c3\u0085", "&Aring;") .replaceAll("\u00c3\u0086", "&AE...
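
The pairs in the list above look like UTF-8 bytes that were read as Latin-1: `\u00c3\u0080` corresponds to the bytes C3 80, which is the UTF-8 encoding of À. Under that assumption (the question does not confirm it), the whole table collapses into one re-decoding step. The question asks about PHP; this is only a Python 3 sketch of the idea.

```python
from html.entities import codepoint2name

# First entry from the question: two characters that are really two UTF-8 bytes.
garbled = "\u00c3\u0080"

# Turn the characters back into the original bytes, then decode them as UTF-8.
fixed = garbled.encode("latin-1").decode("utf-8")

# Look up the named HTML entity for the repaired character.
entity = "&%s;" % codepoint2name[ord(fixed)]
print(fixed, entity)
```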

Delphi: Any StringReplaceW or WideStringReplace functions out there?

Are there any wide-string manipulation implementations out there? function WideUpperCase(const S: WideString): WideString; function WidePos(Substr: WideString; S: WideString): Integer; function StringReplaceW(const S, OldPattern, NewPattern: WideString; Flags: TReplaceFlags): WideString; etc ...

SHA-1 and Unicode

Hi everyone, is the behavior of the SHA-1 algorithm defined for Unicode strings? I do realize that SHA-1 itself does not care about the content of the string; however, it seems to me that in order to pass standard tests for SHA-1, the input string should be encoded with UTF-8. ...
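
SHA-1 operates on bytes, so the digest of a Unicode string depends on which encoding is applied first. A minimal Python 3 sketch of that point, using the well-known "abc" test vector and a sample string chosen here purely for illustration:

```python
import hashlib

# The standard "abc" test vector is pure ASCII, so UTF-8 yields the expected bytes.
abc_digest = hashlib.sha1("abc".encode("utf-8")).hexdigest()
print(abc_digest)

# A non-ASCII string hashes differently depending on the encoding chosen.
text = "M\u00fcnster"
utf8_digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
latin1_digest = hashlib.sha1(text.encode("latin-1")).hexdigest()
print(utf8_digest != latin1_digest)
```

Both sides of a comparison therefore need to agree on the encoding before hashing; UTF-8 is a common choice, not something the algorithm itself mandates.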

Python unichr problem

I have a problem with unichr() on my server. Please see below: On my server (Ubuntu 9.04): >>> print unichr(255) Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in position 0: ordinal not in range(128) On my desktop (Ubuntu 9.10): >>> prin...

Problem with literal arguments in the PATTERN string for a Python 2to3 fixer

Hi folks. I'm writing a fixer for the 2to3 tool in Python. In my pattern string, I have a section where I'd like to match an empty string as an argument, or an empty Unicode string. The relevant chunk of my pattern looks like: (args='""' | args='u""') My issue is the second option never matches. Even if it's alone, it won't match. Ho...

Unicode and PHP - am I doing something wrong?

I'm using Kohana 3, which has full support for Unicode. I have this as the first child of my <head>: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> The Unicode character I am inserting is é, as in Café. However, I am getting the triangle with a ? (as in could not decode character). As far as I can tell in...

PHP regex question: how to match non-ASCII letters in the latin1_swedish_ci charset?

I have this string: Verbesserungsvorschläge, which I think is German. Now I want to match it with a regex in PHP. More generally, I want to match characters, such as the German ones, that are not in the ASCII set. Thanks. ...

NSString - Unicode to ASCII equivalent

Hello, I need to convert an NSString in Unicode to an NSString in ASCII, changing all local characters: Ą to A, Ś to S, Ó to O, ü to u, and so on... What is the simplest way to do it? ...
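
One common approach to this kind of conversion is to decompose each character and drop the combining marks. The question is about NSString; this is only a Python 3 sketch of the general technique, and `to_ascii` is a hypothetical helper name. Letters without a decomposition (for example Ł) are simply dropped by the final ASCII step, so a real implementation would need a fallback table for them.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Strip diacritics by decomposing characters and removing combining marks."""
    # NFKD splits a letter such as U+0104 into its base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is discarded in this sketch.
    return stripped.encode("ascii", "ignore").decode("ascii")


# The four examples from the question: U+0104, U+015A, U+00D3, U+00FC.
print(to_ascii("\u0104\u015a\u00d3\u00fc"))
```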

Removing non-breaking spaces from strings using Python

Hello: I am having some trouble with a very basic string issue in Python (that I can't figure out). Basically, I am trying to do the following: '# read file into a string myString = file.read() '# Attempt to remove non breaking spaces myString = myString.replace("\u00A0"," ") '# however, when I print my string to output to consol...
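
In Python 2, `"\u00A0"` inside a plain byte-string literal is not an escape sequence, so the replace call in the question searches for a literal backslash followed by `u00A0`. A minimal sketch of the usual fix, written for Python 3 and assuming the file content is UTF-8 (the sample bytes below are made up for illustration): decode first, then replace on the text.

```python
# Stand-in for bytes read from a file; assumed to be UTF-8 encoded.
raw_bytes = "price:\u00a0100\u00a0EUR".encode("utf-8")

# Decode to text so that U+00A0 is a single character rather than two bytes.
text = raw_bytes.decode("utf-8")

# Replace every non-breaking space with an ordinary space.
cleaned = text.replace("\u00a0", " ")
print(cleaned)
```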

Unicode filename to python subprocess.call()

I'm trying to run subprocess.call() with a Unicode filename, and here is the simplified problem: n = u'c:\\windows\\notepad.exe ' f = u'c:\\temp\\nèw.txt' subprocess.call(n + f) which raises the famous error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' Encoding to utf-8 produces the wrong filename, and mbcs passes filena...

Unicode and URI encoding, decoding and escaping in JavaScript

If you look at this table here, it has a list of escape sequences for Unicode characters that don't actually work for me. For example, for "%96", which should be a –, I get an error when trying to decode: decodeURIComponent("%96"); URIError: URI malformed If I attempt to encode "–" I actually get: encodeURIComponent("–"); "%E2%80%93" ...
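
The byte 0x96 is the en dash in Windows-1252 but is not valid UTF-8 on its own, which would explain why a UTF-8 based decoder rejects "%96". That the table in question uses Windows-1252 is an assumption. The question is about JavaScript; this Python 3 sketch uses `urllib.parse` only to illustrate the difference between the two encodings.

```python
from urllib.parse import quote, unquote

# Percent-decoding with an explicit single-byte encoding recovers the en dash.
dash = unquote("%96", encoding="cp1252")
print(dash == "\u2013")

# With the default UTF-8 decoding, the lone byte becomes the replacement character.
broken = unquote("%96")
print(broken == "\ufffd")

# Percent-encoding the en dash as UTF-8 gives the three-byte form from the question.
encoded = quote("\u2013")
print(encoded)
```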

Why does Python print unicode characters when the default encoding is ASCII?

From the Python 2.6 shell: >>> import sys >>> print sys.getdefaultencoding() ascii >>> print u'\xe9' é >>> I expected to have either some gibberish or an Error after the print statement, since the "é" character isn't part of ASCII and I haven't specified an encoding. I guess I don't understand what ASCII being the default encoding me...