unicode

How to deal with Unicode strings in C/C++ in a cross-platform friendly way?

On platforms other than Windows you can simply use char * strings and treat them as UTF-8. The problem is that on Windows you are required to accept and send messages using wchar_t* strings (the W functions). If you use the ANSI functions (A), you will not support Unicode. So if you want to write a truly portable application you need to compile ...

problem using getline with a unicode file

UPDATE: Thank you to @Potatoswatter and @Jonathan Leffler for comments - rather embarrassingly I was caught out by the debugger tool tip not showing the value of a wstring correctly - however it still isn't quite working for me and I have updated the question below: If I have a small multibyte file I want to read into a string I use the...

How to determine if a character is a Chinese character

How can I determine whether a character is a Chinese character using Ruby? ...
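One rough way to think about this question, sketched in Python rather than Ruby. It assumes that "Chinese character" means a code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF); the helper name `is_cjk_unified` is hypothetical, and the extension blocks are deliberately ignored.

```python
def is_cjk_unified(ch: str) -> bool:
    """Return True if ch is one code point in U+4E00..U+9FFF.

    Assumption: only the basic CJK Unified Ideographs block counts;
    extension blocks and compatibility ideographs are not checked.
    """
    return len(ch) == 1 and 0x4E00 <= ord(ch) <= 0x9FFF


print(is_cjk_unified("翻"))  # a character from the basic block
print(is_cjk_unified("A"))   # a Latin letter
```

Note that this block is shared by Chinese, Japanese, and Korean text, so a range check like this cannot tell the languages apart.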

Encoding issue with form and HTML Purifier / MySQL

Driving me nuts... Page with form is encoded as Unicode (UTF-8) via: <meta http-equiv="content-type" content="text/html; charset=utf-8"> entry column in database is text utf8_unicode_ci copying text from a Word document with " in it, like this: “1922.” is insta-fail and ends up in the database as â��1922.â�� (typing new data into the...
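A pattern like the one in this excerpt typically appears when UTF-8 bytes are read back as a single-byte encoding somewhere between the form and the database. The Python sketch below reproduces that effect under the assumption that the wrong encoding is Latin-1; the actual cause in the poster's setup is not known from the excerpt.

```python
# Curly quotes as pasted from a word processor: U+201C and U+201D.
original = "\u201c1922.\u201d"

# Correct UTF-8 encoding: each curly quote becomes three bytes.
utf8_bytes = original.encode("utf-8")

# Misreading those bytes as Latin-1 yields "â" followed by two
# unprintable control characters for each quote.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, which shows the bytes
# themselves were never damaged, only misinterpreted.
repaired = garbled.encode("latin-1").decode("utf-8")

print(utf8_bytes)
print(repr(garbled))
print(repaired == original)
```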

Access 2007 and Special/Unicode Characters in SQL

I have a small Access 2007 database that I need to be able to import data from an existing spreadsheet and put it into our new relational model. For the most part this seems to work pretty well. Part of the process is attempting to see if a record already exists in a target table using SQL. For example, if I extract book information o...

How to open a file with UNICODE filename on Windows?

There is a third-party library that only accepts a char* filename, e.g. 3rdlib_func_name(char* file_name). Everything goes wrong when I provide a filename in Chinese or Japanese. Is there any way to make this library open a Unicode filename? The program is running on Windows. Thanks for your reply. ...

Django: Unicode Filenames with ASCII headers?

I have a list of strangely encoded files: 02 - Charlie, Woody and You/Study #22.mp3 which I suppose isn't so bad but there are a few particular characters which Django OR nginx seem to be snagging on. >>> test = u'02 - Charlie, Woody and You/Study #22.mp3' >>> test u'02 - Charlie, Woody and You\uff0fStudy #22.mp3' I am using nginx as ...

Unicode escaping in C/C++

Hi guys! I'm having a dispute with a colleague of mine. She says that the following: char* a = "\x000aaxz"; will/can be seen by the compiler as "\x000aa". I do not agree with her, as I think you can have a maximum number of 4 hex characters after the \x. Can you have more than 4 hex chars? Who is right here? ...

How to compute a Unicode string whose bidirectional representation is specified?

Hello, fellows. I have a rather perverse question. Please forgive me :) There's an official algorithm that describes how bidirectional Unicode text should be presented. http://www.unicode.org/reports/tr9/tr9-15.html I receive a string (from some 3rd-party source), which contains Latin/Hebrew characters, as well as digits, white-spaces, ...

GWT Characters with accent not recognised when added programmatically

I am using the UIBinder in GWT but I have problems displaying letters with an accent. My xml looks like this <!DOCTYPE ui:UiBinder SYSTEM "http://dl.google.com/gwt/DTD/xhtml.ent"> <ui:UiBinder xmlns:ui="urn:ui:com.google.gwt.uibinder" xmlns:g="urn:import:com.google.gwt.user.client.ui"> ... <g:Label ui:field="lbl"></Label> I...

I need to use "ö", "ä", "ü" in my program, but java/android won't let me

Hi all, I need my program to send a request to a server. The problem is, the server only recognizes ös, äs and üs, but JAVA and/or Android don't know them. How can I send a request with a String like "Hermann-Löns" without JAVA/Android "changing" the ö.... Oh and btw., "oe" isn't recognized by the server either, already tried that... thx f...

File.open with ruby on windows with a unicode filename

I have a script running on Ruby 1.9.1 on Windows 7. I've distilled my script down to File.open("翻譯測試.txt") and still can't get it to work. I know there are issues with Ruby 1.9 filename handling on Windows (using the Windows ANSI library), but would be happy enough with a workaround that is callable from Ruby ...

Unicode characters in URLs

In 2010, would you serve URLs containing UTF-8 characters in a large web portal? Unicode characters are forbidden as per the RFC on URLs (see here). They would have to be percent encoded to be standards compliant. My main point, though, is serving the unencoded characters for the sole purpose of having nice-looking URLs, so percent enc...
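For reference, the percent-encoded form that the question contrasts with "nice-looking" URLs can be sketched in Python with the standard `urllib.parse` module. The path used here is a hypothetical example, and UTF-8 is assumed as the byte encoding before percent-escaping.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character.
path = "/wiki/Hermann-Löns"

# quote() encodes to UTF-8 by default and leaves "/" unescaped,
# so only the "ö" (bytes C3 B6) is turned into percent escapes.
encoded = quote(path)

# unquote() reverses the escaping and decodes the bytes as UTF-8.
decoded = unquote(encoded)

print(encoded)
print(decoded == path)
```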

Any hints for those that want to upgrade from Delphi 7 (and down) to Delphi 2010?

Hi. After updates 4 and 5 I am interested in re-evaluating Delphi 2010. This time I intend to port some of my code (small scale) to see how difficult it would be to do at large scale. The main issue seems to be the ASCII to Unicode conversion. Any tips or resources about this that you have found useful? Many thanks. Edit: At this point my r...

How do you print raw UTF-8 characters from their numbers? [PHP]

Say I wanted to print a ÿ (latin small y with diaeresis) from its Unicode/UTF-8 number of U+00FF or hex of c3 bf. How can I do that in PHP? The reason is that I need to be able to create certain UTF-8 characters for testing my regex and string functions. However, since I have less than 200 keys on my keyboard I can't type them - a...
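The two starting points named in this question, a code point number and a sequence of UTF-8 bytes, lead to the same character. The sketch below shows both routes in Python rather than PHP, purely as an illustration of the relationship; the variable names are arbitrary.

```python
# Route 1: from the Unicode code point U+00FF.
from_code_point = chr(0x00FF)

# Route 2: from the UTF-8 byte sequence c3 bf.
from_utf8_bytes = bytes.fromhex("c3bf").decode("utf-8")

# Both routes produce the same one-character string.
print(from_code_point, from_utf8_bytes)
print(from_code_point == from_utf8_bytes)
```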

How would you create a string of all UTF-8 characters? [PHP]

There are many ways to represent the +1 million UTF-8 characters. Take the latin capital "A" with macron (Ā). This is unicode code point U+0100, hex number 0xc4 0x80, decimal number 196 128, and binary 11000100 10000000. I would like to create a collection of the first 65,535 UTF-8 characters for use in testing applications. These are a...
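The representations listed for Ā can be checked mechanically. This is a small Python sketch, not PHP, that derives the hex, decimal, and binary forms from the code point U+0100.

```python
# LATIN CAPITAL LETTER A WITH MACRON, code point U+0100.
ch = "\u0100"

# UTF-8 encoding of that code point (two bytes).
encoded = ch.encode("utf-8")

as_hex = [hex(b) for b in encoded]               # hex form of each byte
as_decimal = list(encoded)                       # decimal value of each byte
as_binary = [format(b, "08b") for b in encoded]  # 8-bit binary of each byte

print(as_hex, as_decimal, as_binary)
```

The code point is the character's identity; the byte values are only its UTF-8 serialization, which is why the same character has several "numbers".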

c++ win32 get utf8 char from keyboard

hello, how would i read keystrokes using the win32 api? i would also like to see them from international keyboards like german umlauts. thanks ...

Load JSON in Python as header character set

Hi everyone, I've always found character sets and encodings complicated to understand and here I'm faced with another problem. My apologies for any inaccuracies. I'll do my best. I'm requesting data from a server which returns JSON. In the HTTP headers it also returns the character set like so: Content-Type: text/html; charset=UTF-8 ...

JSON specifies "any UNICODE character"?

Maybe this is just my unfamiliarity with Unicode, so please correct me if I'm mistaken. Looking at http://json.org/, the spec says that a string can include "any UNICODE character", but this confuses me. JSON is a communication format, correct? At the core of it, everything must translate down to bytes. In contrast, UNICODE is a logic...
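The distinction the question is reaching for, between characters in a JSON string and the bytes on the wire, can be illustrated with Python's standard `json` module. The sample text is arbitrary, and UTF-8 is assumed as the transport encoding.

```python
import json

text = "ö"

# Default output is pure ASCII: the character is written as a \u escape.
ascii_json = json.dumps(text)

# With ensure_ascii=False the character is kept as-is in the JSON text,
# and it only becomes bytes once an encoding (here UTF-8) is chosen.
utf8_json = json.dumps(text, ensure_ascii=False).encode("utf-8")

# Both serializations decode back to the same one-character string.
print(ascii_json, utf8_json)
print(json.loads(ascii_json) == json.loads(utf8_json.decode("utf-8")))
```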

utf8 and unicode getting warning messages in mysql

I have a mysql table. When I try to insert, I get this: Warning: Incorrect string value: '\xAE</...' for column 'value' at row 1 mysql> show create table Configurations; | Configurations | CREATE TABLE `Configurations` ( `id` int(11) NOT NULL AUTO_INCREMENT, `title` varchar(255) NOT NULL, `ckey` varchar(255) NOT NULL, `value` ...
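The byte in the warning, 0xAE, is not valid UTF-8 on its own, which is consistent with data arriving in a single-byte encoding while the column expects UTF-8. The Python sketch below demonstrates that property of the byte; treating the source as Latin-1 is an assumption, since the excerpt does not say where the data came from.

```python
raw = b"\xae"

# A lone 0xAE is a continuation byte, so strict UTF-8 decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Assumption: the byte was produced as Latin-1, where 0xAE is "®".
as_latin1 = raw.decode("latin-1")

# The same character in UTF-8 takes two bytes.
reencoded = as_latin1.encode("utf-8")

print(valid_utf8, as_latin1, reencoded)
```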