unicode

String to wstring conversion on OS X

Hello, I'm trying to convert a C++ string to a wstring. I found the following code that seems to deal with accents, which is what I'm looking for. std::wstring widen(const std::string& s) { std::vector<wchar_t> buffer(s.size()); std::locale loc("fr_FR"); std::use_facet< std::ctype<wchar_t> >(loc).widen(s.data(), s.data() + ...

Finding Unicode for non-English characters

I have to print a non-English string in a Java program. I have the string with me. How do I get the Unicode of its constituent characters so that I can embed the string within the program? ...

Python raw strings and Unicode: how to use Web input as regexp patterns?

EDIT: This question doesn't really make sense once you have picked up what the "r" flag means. More details here. For people looking for a quick answer, I added one below. If I enter a regexp manually in a Python script, I can use 4 combinations of flags for my pattern strings: p1 = "pattern" p2 = u"pattern" p3 = r"pattern" p4 = ru"p...

What exactly do the "u" and "r" string flags do in Python, and what are raw string literals?

While asking this question, I realized I didn't know much about raw strings. For somebody claiming to be a Django trainer, this sucks. I know what an encoding is, and I know what "u" alone does since I get what Unicode is. But what does "r" do exactly, and what kind of string does it result in? And above all, what the heck can "ur" do? Fina...
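
A minimal sketch of what the prefixes asked about here change, assuming Python 3 (the question itself concerns Python 2; in Python 3 the "u" prefix is accepted but redundant, and "ur" is no longer valid syntax). The variable names are illustrative only.

```python
# Assumes Python 3. The "r" prefix changes how the literal is parsed,
# not the type of the resulting object.
plain = "a\nb"   # \n is interpreted as a single newline character
raw = r"a\nb"    # the backslash and the "n" are kept as two characters

print(len(plain))                 # 3
print(len(raw))                   # 4
print(type(plain) is type(raw))   # True: both are str

# A raw literal is just another spelling of the escaped form.
print(r"\d+" == "\\d+")           # True

# In Python 3, u"..." produces the same type and value as "...".
print(u"pattern" == "pattern")    # True
```

Because the prefix only affects literals written in source code, a string that arrives at run time (from a file or a web form, say) is never "raw" or "non-raw"; it simply contains whatever characters were sent.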

How do I match Unicode characters in ANTLR?

Hello, I am trying to pick out all tokens in a text and need to match all ASCII and Unicode characters, so here is how I have laid them out. fragment CHAR : ('A'..'Z') | ('a'..'z'); fragment DIGIT : ('0'..'9'); fragment UNICODE : '\u0000'..'\u00FF'; Now if I write my token rule as: TOKEN : (CHAR|DIGIT|UNICODE)+; I get ...

Nasty unicode and C++: Easy way to read ASCII/UTF-8/UTF-16 BE/LE text file

Hello everyone, sorry if the question is stupid and has been asked thousands of times, but I spent a few hours googling it and could not find an answer. I want to read in a text file which can be any of these: ASCII/UTF-8/UTF-16 BE/LE. I assume that if the file is Unicode then a BOM is always present. Is there any automatic way (STL,Boost or so...

SQLAlchemy and Can't Adapt.

Hi. I have the following exception when using sqlalchemy on postgres: raise exc.DBAPIError.instance(statement, parameters, e, connection_invalidated=is_disconnect) ProgrammingError: (ProgrammingError) can't adapt 'UPDATE doc_data SET content=%(content)s WHERE doc_data.serial_id = %(doc_data_serial_id)s' {'content': 'Progr...

iPhone Unicode Text with CoreGraphics

Hi, I'm using CGContextShowGlyphsAtPoint in my application to render characters that follow a path. I'd had no problems with this at all until I started on i18n of the application and found that Japanese characters appear fine everywhere apart from when I try to render them using this function. Is there a way to have these characters a...

Finding out Unicode character name in .NET

Is there a way in .NET to find out what Unicode name a certain character has? If not, is there a library that can do this? ...

C++: Printing ASCII Hearts and Diamonds in a Platform-Independent Way

I'm developing a card playing game and would like to print out the symbol for hearts, diamonds, spades and clubs. My target platform will be Linux. In Windows, I know how to print out these symbols. For example, to print out a heart (in ASCII) I wrote... // in Windows, print a ASCII Heart #include <iostream> using std::cout; using ...

Convert Unicode to ASCII without changing the string length (in Java)

What is the best way to convert a string from Unicode to ASCII without changing its length (that is very important in my case)? Also the characters without any conversion problems must be at the same positions as in the original string. So an "Ä" must be converted to "A" and not something cryptic that has more characters. Edit: @novali...
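
The question is about Java, but the length-preserving idea can be sketched in Python: handle one character at a time, decompose it, and keep only its base letter. The helper name `fold_to_ascii` and the "?" placeholder are hypothetical choices for illustration, not part of the original question.

```python
import unicodedata

def fold_to_ascii(text, placeholder="?"):
    """Map each input character to exactly one ASCII character."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # NFKD splits "Ä" into "A" plus a combining diaeresis; keep the base.
        base = unicodedata.normalize("NFKD", ch)[0]
        # Characters with no ASCII base (e.g. "ß") become the placeholder.
        out.append(base if ord(base) < 128 else placeholder)
    return "".join(out)

print(fold_to_ascii("\u00c4pfel \u00fcber"))   # Apfel uber
print(fold_to_ascii("Stra\u00dfe"))            # Stra?e
```

Working per character guarantees one output character per input character, so positions are preserved; the price is that anything without an ASCII base letter collapses to the placeholder.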

How can I iterate over every character in a given encoding using Python?

Is there a way to iterate over every character in a given encoding, and print its code? Say, UTF-8? ...
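
One possible reading of this, sketched under assumptions: for a single-byte encoding, try decoding each of the 256 byte values; for UTF-8, which can encode every Unicode code point except the surrogates, walk the code point range directly. Both function names are hypothetical.

```python
def characters_in_single_byte_encoding(encoding):
    """Yield (byte value, character) for each byte the codec can decode."""
    for value in range(256):
        try:
            ch = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the encoding
        yield value, ch

def all_unicode_scalar_values():
    """Yield every code point UTF-8 can encode (surrogates excluded)."""
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue
        yield cp

# Print the last few Latin-1 bytes with the code point each one maps to.
for value, ch in characters_in_single_byte_encoding("latin-1"):
    if value >= 0xFC:
        print(hex(value), "U+%04X" % ord(ch))
```

The second generator yields code points whether or not a character is assigned to them; filtering by assignment would need the Unicode character database.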

How do I get a µ character out of sqlite and onto a web-page?

On a Python-driven web app using a sqlite datastore I had this error: Could not decode to UTF-8 column 'name' with text '300µL-10-10' Reading here it looks like I need to switch my text-factory to str and get bytestrings but when I do this my html output looks like this: 300�L-10-10 I do have my content-type set as: <meta ...

Using Objective-C/Cocoa to unescape Unicode characters, i.e. \u1234

Some sites that I am fetching data from are returning UTF-8 strings, with the UTF-8 characters escaped, i.e.: \u5404\u500b\u90fd Is there a built-in Cocoa function that might assist with this or will I have to write my own decoding algorithm? ...
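
The question asks about Cocoa; as a language-neutral illustration of the "own decoding algorithm" option, here is a rough Python sketch that replaces each literal \uXXXX sequence with the character it names. The function name `unescape_unicode` is hypothetical.

```python
import re

# A backslash, a "u", then exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def unescape_unicode(text):
    """Replace each literal \\uXXXX sequence with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

escaped = r"\u5404\u500b\u90fd"   # 18 characters: backslashes are literal here
decoded = unescape_unicode(escaped)
print(len(escaped), len(decoded))   # 18 3
```

This sketch does not join surrogate pairs written as two consecutive escapes, so characters outside the Basic Multilingual Plane would need an extra step.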

Unicode Supplementary Multilingual Plane in Java

Hi everybody, I want to work with the SMP (Supplementary Multilingual Plane) in Java. Actually, I want to print a character whose code point is greater than 0xFFFF. I used this line of code: int hexCodePoint = Character.toCodePoint('\uD801', '\uDC02' ); to get the code point of a special character. But how can I print this unicode character to...
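
For reference, the arithmetic that turns a UTF-16 surrogate pair into a single code point can be sketched as follows, here in Python rather than Java. The function name `combine_surrogates` is hypothetical; the formula is the standard UTF-16 one.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # The high surrogate carries the top 10 bits, the low one the bottom 10.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

cp = combine_surrogates(0xD801, 0xDC02)
print(hex(cp))        # 0x10402
print(len(chr(cp)))   # 1: a single character in Python 3
```

For the pair in the question, (0xD801 - 0xD800) << 10 is 0x400 and 0xDC02 - 0xDC00 is 2, so the result is 0x10000 + 0x400 + 2 = 0x10402.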

What are some common character encodings that a text editor should support?

I have a text editor that can load ASCII and Unicode files. It automatically detects the encoding by looking for the BOM at the beginning of the file and/or searching the first 256 bytes for characters > 0x7f. What other encodings should be supported, and what characteristics would make that encoding easy to auto-detect? ...
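
The detection scheme described above (BOM first, then a scan of the first 256 bytes for values above 0x7f) might look roughly like this in Python. The function name `sniff_encoding`, the "unknown-8bit" label, and the UTF-8 validity check are assumptions added for illustration, not something the question states.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_encoding(data, probe=256):
    """Guess (encoding, BOM length) from a BOM, else from the first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    head = data[:probe]
    if all(b <= 0x7F for b in head):
        return "ascii", 0   # only a guess: later bytes are not inspected
    try:
        # final=False tolerates a multi-byte sequence cut off by the probe.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
        return "utf-8", 0
    except UnicodeDecodeError:
        return "unknown-8bit", 0   # some legacy code page; needs heuristics

print(sniff_encoding(b"\xef\xbb\xbfhello"))   # ('utf-8', 3)
print(sniff_encoding(b"\xff\xfeh\x00i\x00"))  # ('utf-16-le', 2)
print(sniff_encoding(b"plain text"))          # ('ascii', 0)
```

One known ambiguity: a UTF-16 LE file whose first character after the BOM is U+0000 is indistinguishable from UTF-32 LE by the BOM alone.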

Linq filter issue involving a varchar(1) field

I have a field in my database that is varchar(1). I'm not permitted to change it. The only values for this field are 0 or 1. Here is the where clause of the linq query: where g.id_Group == idGroup && a.AccountOpen.Value == '1' My linq query generated the following sql where clause WHERE ([t1].[id_Group] = 1234) AND (UNICODE([t0].[A...

Unicode in PowerShell with Python? Alternative shells in Windows?

I want a shell that supports Unicode in Windows; PowerShell as it ships doesn't seem to. Powershell V2 (win7 x64) : PS C:\> powershell Windows PowerShell Copyright (C) 2009 Microsoft Corporation. All rights reserved. PS C:\> python Python 2.6.2 (r262:71605, Apr 14 2009, 22:46:50) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copy...

IronPython string replace on 16-bit Unicode character

In 2.6.4, is there a reason I can't do: "my string".replace(u'\u200E', '') without getting an index exception? This looks like a bug in IronPython but I'm not sure... ...

strcmp or _tcscmp in UNICODE

Hi. For comparing strings in UNICODE versions, is it advisable to use strcmp or _tcscmp? Thanks in advance ...