unicode

passing unicode string from C# exe to C++ DLL

Using this function in my C# exe, I try to pass a Unicode string to my C++ DLL: [DllImport("Test.dll", CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)] public static extern int xSetTestString(StringBuilder xmlSettings); This is the function on the C++ DLL side: __declspec(dllexport) int xSetTestStrin...

Unicode characters and IE

I just built a site that relies on certain Unicode characters like Ⓐ, but have just realized that IE doesn't show these characters. Is there a meta tag that makes the browser display them, or is there a way to update IE to handle these Unicode characters? ...

Track unicode words from Twitter using Ruby and the Tweetstream API

I am trying to track a set of keywords from Twitter by using the Streaming API (can't post the link here because of spam limitations: google twitter streaming API). I am doing this inside Ruby, using the TweetStream gem: http://bit.ly/cODAWI The problem I have is that I want to track keywords that contain some unicode/UTF-8 character...

How to use unicode inside an xpath string? (UnicodeEncodeError)

I'm using XPath in Selenium RC via the Python API. I need to click an a element whose text is "Submit »". Here's the error that I'm getting: In [18]: sel.click(u"xpath=//a[text()='Submit \xbb')]") --------------------------------------------------------------------------- UnicodeDecodeError Traceback (most recent...

SQLAlchemy automatically converts str to unicode on commit

Hello, When inserting an object into a database with SQLAlchemy, all its properties that correspond to String() columns are automatically transformed from <type 'str'> to <type 'unicode'>. Is there a way to prevent this behavior? Here is the code: from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData from sql...

SQLite/iPhone read copyright symbol

Hi All, I am having problems reading the copyright symbol from a SQLite db for an app that I am developing. I import the information manually, i.e., from an Excel sheet. I have tried two ways of doing it and failed with both: Tried replacing the copyright symbol with "\u00ae" (unicode combination) within Excel and then import...

How can I make Python friendlier about reading and writing Unicode text files?

I found that even modern Python versions (like 3.x) do not detect a BOM in text files. I would like to know if there is any module that adds this missing feature to Python by replacing the open() and codecs.open() functions for reading and writing text files. ...
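
A minimal sketch of BOM sniffing for the question above, not a drop-in replacement for open(). The helper names sniff_encoding and open_text are hypothetical, and falling back to UTF-8 when no BOM is present is an assumption. It relies only on the BOM constants in the standard codecs module and on the "utf-8-sig", "utf-16" and "utf-32" codecs, which consume the BOM while decoding.

```python
import codecs
import os
import tempfile

# Check the longest signatures first: the UTF-32 LE BOM starts with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(path, default="utf-8"):
    """Return a codec name based on the file's BOM, or `default` (an assumption) if none."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def open_text(path, default="utf-8"):
    """Hypothetical helper: open `path` for reading with the sniffed encoding."""
    return open(path, "r", encoding=sniff_encoding(path, default))


# Usage: write a UTF-16 file with a BOM, then read it back without naming the encoding.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF16_LE + "héllo".encode("utf-16-le"))
    detected = sniff_encoding(path)
    with open_text(path) as f:
        text = f.read()
```

Files without a BOM cannot be identified this way, so the default still has to be chosen by the caller.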

How can you print a string using raw_unicode_escape encoding in Python 3?

The following code will fail in Python 3.x with TypeError: must be str, not bytes because now encode() returns bytes and print() expects only str. #!/usr/bin/python from __future__ import print_function str2 = "some unicode text" print(str2.encode('raw_unicode_escape')) How can you print a Unicode string escaped representation using p...
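
One possible approach in Python 3, sketched under the assumption that the goal is a str holding the escaped form: encode with raw_unicode_escape, then decode the resulting bytes back to str. Latin-1 is used for the decode step because raw_unicode_escape leaves code points below 256 as single raw bytes rather than escaping them. The sample text is made up for illustration.

```python
# Sample text (an assumption, not from the question): ASCII plus U+2603 SNOWMAN.
text = "snowman \u2603"

# encode() yields bytes; decoding them as Latin-1 maps each byte back to one
# character, giving a str that contains the backslash escape literally.
escaped = text.encode("raw_unicode_escape").decode("latin-1")

print(escaped)  # prints: snowman \u2603
```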

Perl latin-9? Unicode - need to add support

I have an application that is being expanded to the UK and I will need to add support for Latin-9 Unicode. I have done some Googling but found nothing solid as to what is involved in the process. Any tips? Here is some code (Just the bits for Unicode stuff) use Unicode::String qw(utf8 latin1 utf16); # How to call $encoded_txt = $self-...

Python encoding for pipe.communicate

I'm calling pipe.communicate from Python's subprocess module in Python 2.6. I get the following error from this code: from subprocess import Popen pipe = Popen(cwd) pipe.communicate( data ) for an arbitrary cwd, where data contains Unicode (specifically 0xE9): Exec. exception: 'ascii' codec can't encode character u'\x...

Matching Unicode Dashes in Java Regular Expressions?

I'm trying to craft a Java regular expression to split strings of the general format "foo - bar" into "foo" and "bar" using Pattern.split(). The "-" character may be one of several dashes: the ASCII '-', the em-dash, the en-dash, etc. I've constructed the following regular expression: private static final Pattern titleSegmentSeparator...

Using end of word mark with unicode in regular expressions in Python

The following matches in IDLE, but does not match when run in a method in a module file: import re re.search('\\bשלום\\b','שלום עולם',re.UNICODE) while the following matches in both cases: import re re.search('שלום','שלום עולם',re.UNICODE) (Notice that Stack Overflow erroneously switches the first and second items in the line above ...
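
A small Python 3 sketch of one plausible cause, offered as an assumption since the excerpt is truncated: \b depends on what counts as a word character, and that differs between text (str) patterns and byte patterns. With str, Hebrew letters are word characters; with UTF-8 bytes, only ASCII letters, digits and underscore are, so no word boundary is found around the Hebrew word.

```python
import re

text = "שלום עולם"
word = "שלום"

# str pattern: Unicode-aware matching, so \b sees Hebrew letters as word characters.
text_match = re.search(r"\b" + word + r"\b", text)

# bytes pattern: the same search on UTF-8 bytes; non-ASCII bytes are not word
# characters here, so \b never matches next to them.
byte_match = re.search(rb"\b" + word.encode("utf-8") + rb"\b", text.encode("utf-8"))

print(text_match is not None)  # prints: True
print(byte_match is not None)  # prints: False
```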

Displaying Unicode Characters

I already searched for answers to this sort of question here, and have found plenty of them -- but I still have this nagging doubt about the apparent triviality of the matter. I have read this very interesting and helpful article on the subject: http://www.joelonsoftware.com/articles/Unicode.html, but it left me wondering about how one w...

How can I convert a string into a Unicode string in Perl?

How do I convert a string into a Unicode string in Perl? I am looking at an attribute in LDAP which accepts only Unicode strings, so I want to convert a normal string to a Unicode string ...

Weird SQL Server 2005 Collation difference between varchar() and nvarchar()

Can someone please explain this: SELECT CASE WHEN CAST('iX' AS nvarchar(20)) > CAST('-X' AS nvarchar(20)) THEN 1 ELSE 0 END, CASE WHEN CAST('iX' AS varchar(20)) > CAST('-X' AS varchar(20)) THEN 1 ELSE 0 END Results: 0 1 SELECT CASE WHEN CAST('i' AS nvarchar(20)) > CAST('-' AS nvarchar(20)) THEN 1 ELSE 0 E...

Sending Persian parameters to a web service?

My site displays information in Farsi, but when text is sent to the web service, the characters are displayed as "?". The pages are saved with Unicode (UTF-8 with signature), codepage 65001, and the following tags are in my master page: <'html xmlns="http://www.w3.org/1999/xhtml" lang="fa" xml:lang="fa" > <'meta http-equiv="Conte...

'' Not a valid unicode character, but in the unicode character set?

Short story: I can't get an entity like '𠂉' to store in a MySQL database, either by using a text field in a Ruby on Rails app (with default UTF-8 encoding) or by inputting it directly with a MySQL GUI app. As far as I can tell, all Chinese characters and radicals can be entered into the database without problem, but not these rarely typ...

Does Access have any issues with unicode capable data types like nvarchar in SQL Server?

I am using Access 2003 as a front end UI for a SQL Server 2008 database. In looking at my SQL Server database design I am wondering if nvarchar was the right choice to use over varchar. I chose nvarchar because I thought it would be useful in case any characters represented by unicode needed to be entered. However, I didn't think about a...

C++ project type: unicode vs multi-byte; pros and cons

I'm wondering what the Stack Overflow community thinks when it comes to creating a project (thinking primarily c++ here) with a unicode or a multi-byte character set. Are there pros to going Unicode straight from the start, implying all your strings will be in wide format? Are there performance issues / larger memory requirements becau...

Problem copying text (Indian language: Gujarati) from a Word document into a web page text area

Hi all, I am developing a site in an Indian language (Gujarati). My problem is as follows: my client wants to be able to copy Gujarati text from a Word document and paste it into a text area. But when I copy text from the Word document and paste it into the text area, it gets converted to English letters. http://www.chanakyanipothi.com/gujchan...