questions about unicode

latin characters showing in some parts of the page and not others

The page in question is Apple Amor. You can see that in the footer the Spanish vowels seem to display properly, but in the slide-down bar (header) they get messed up. Any ideas why? ...

html

unicode

character-encoding

latin

[Android] reading unicode text from assets

Trying to read a UTF-8 encoded file in Android...
InputStreamReader reader = new InputStreamReader(assets.open("data.txt"), "UTF-8");
BufferedReader br = new BufferedReader(reader);
String line;
// The line below throws an IOException!!
line = br.readLine();
What's wrong with this code? ...

android

unicode

file-io

How to convert Hebrew (Unicode) to ASCII in C#?

I have to create some sort of text file in which there are numbers and Hebrew letters decoded to ASCII. This is the file-creation method, which is triggered on ButtonClick: protected void ToFile(object sender, EventArgs e) { filename = Transactions.generateDateYMDHMS(); string path = string.Format("{0}{1}.001", Server.MapPath("~/transact...

c#

unicode

encoding

Handling unicode data in XMLRPC

I have to migrate data to OpenERP through XMLRPC by using TerminatOOOR. I send a name with the value "Rotule right Aurélia". In Python the name will be encoded as 'Rotule right Aur\xc3\xa9lia ', but in TerminatOOOR (the xmlrpc client) the data is encoded as 'Rotule middle Aur\357\277\275lia'. So in the server side, the data value...
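For readers decoding the two byte strings quoted in this excerpt, here is a small Python 3 sketch (the variable names are illustrative and not from the question). `\xc3\xa9` is the UTF-8 encoding of é, while the octal escapes `\357\277\275` are the bytes EF BF BD, which is the UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER. That suggests, without proving it, that the é was already lost before the client serialized the value.

```python
# Bytes as quoted on the Python side: valid UTF-8 for "Aurélia".
python_side = b"Aur\xc3\xa9lia"

# Bytes as quoted on the client side, written with the same octal escapes.
client_side = bytes([0o357, 0o277, 0o275])

decoded_name = python_side.decode("utf-8")
decoded_marker = client_side.decode("utf-8")

print(decoded_name)
print(hex(ord(decoded_marker)))  # code point of the single decoded character
```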

Arguments for and against supporting std::wstring exclusively in cross-platform library

I'm currently developing a cross-platform C++ library which I intend to be Unicode aware. I currently have compile-time support for either std::string or std::wstring via typedefs and macros. The disadvantage with this approach is that it forces you to use macros like L("string") and to make heavy use of templates based on character type...

how to determine if NSString is normalized in NFD

Hi, I need to determine whether a given NSString is in NFD form. How do I do that? Context: the file path I get from Mac OS (in the form of an NSString) is in canonical decomposed form (NFD). This is true especially when the filesystem is HFS Plus. http://developer.apple.com/mac/library/technotes/tn/tn1150.html#CanonicalDecomposition i nee...

unicode

nsstring

normalization
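The general idea behind such a check can be sketched in Python 3 rather than with NSString: a string is in NFD exactly when normalizing it to NFD leaves it unchanged. The helper name `is_nfd` is hypothetical, and this illustrates the concept only, not the Cocoa API the question asks about.

```python
import unicodedata


def is_nfd(text):
    """Return True if text is already in canonical decomposed form (NFD)."""
    return unicodedata.normalize("NFD", text) == text


composed = "\u00e9"      # é as one precomposed code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(is_nfd(composed), is_nfd(decomposed))
```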

How to convert unicode characters to unicode decimal entities in PHP?

Hi, I have unicode characters in MySQL tables. I will print the data in the web pages. While printing it in the pages I am generating the 'Share This' buttons dynamically to share each record in that table (which is in Punjabi). So the output in the page looks fine. But while sharing the same content in 'Share This' the destination pag...

Can I get SQLite to return string instead of unicode for TEXT in Python?

AFAIK SQLite returns unicode objects for TEXT in Python. Is it possible to get SQLite to return string objects instead? ...
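One possible approach, sketched here for Python 3 with the standard `sqlite3` module: the connection's `text_factory` attribute controls what type TEXT values come back as. Setting it to `bytes` returns the raw UTF-8 bytes. The question predates Python 3, where the equivalent setting would have been `str`; the table and column names below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.text_factory = bytes  # return TEXT columns as raw UTF-8 bytes

# Hypothetical table used only for this demonstration.
conn.execute("CREATE TABLE people (name TEXT)")
conn.execute("INSERT INTO people VALUES (?)", ("Aur\u00e9lia",))

value = conn.execute("SELECT name FROM people").fetchone()[0]
conn.close()

print(type(value), value)
```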

php-excel-reader - problem with UTF-8

Hi, I'm using php-excel-reader 2.21 for converting an XLS file to CSV. I wrote a simple script to do that, but I have some problems with unicode characters. It does not return values from some cells. For example, it has no problems with the cell content ceník položek but does have problems with nákup, VÝROBCE, PÁS, HRUBÝ, NÁKLADNÍ and some othe...

php

excel

unicode

Arabic being displayed in gibberish and question marks

Hey, I created a PHP website that simply loads text from a MySQL database. When I open it in a browser, the Arabic text is presented as gibberish, but when I change the encoding of my browser to UTF-8 it is displayed properly. How can I force the encoding to be UTF-8 so users don't have to change it? The menu part of the webs...

Removing non-ascii characters from any given stringtype in Python

>>> teststring = 'aõ'
>>> type(teststring)
<type 'str'>
>>> teststring
'a\xf5'
>>> print teststring
aõ
>>> teststring.decode("ascii", "ignore")
u'a'
>>> teststring.decode("ascii", "ignore").encode("ascii")
'a'
which is what I really wanted it to store internally as I remove non-ascii characters. Why did the decode("ascii give out a uni...
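A minimal Python 3 sketch of the same idea, handling both text and byte strings. The function name `strip_non_ascii` is hypothetical, and the session above is Python 2, where `str` is a byte string, so the types differ from what the question shows.

```python
def strip_non_ascii(value):
    """Drop every non-ASCII character (or byte) and keep the input's type."""
    if isinstance(value, bytes):
        # Keep only bytes in the 7-bit ASCII range.
        return bytes(b for b in value if b < 0x80)
    # For text, encode with errors ignored, then decode back to str.
    return value.encode("ascii", "ignore").decode("ascii")


print(strip_non_ascii("a\u00f5"))
print(strip_non_ascii(b"a\xf5"))
```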

Converting unicode objects with non-ascii symbols in them into strings objects (in python)

I want to send Chinese characters to be translated by an online service, and have the resulting English string returned. I'm using simple json and urllib for this. And yes, I am declaring # -*- coding: utf-8 -*- at the top of my code. The thing is, now everything works fine if I feed urllib a string type object, even if that object c...

Goofy Unicode problem: mï ¿ ½

I have some text coming into a database that apparently has some sort of Unicode issue. The literal text coming in is "5 mï ¿ ½ in area", which appears to be some sort of unit of measure, but I can't sort out what the meaning is in context. Searching Google shows many similar results, so this is apparently a common set of symbols. ...

unicode

encoding
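One plausible reading of those symbols, offered as a guess rather than a diagnosis: the three characters ï, ¿ and ½ are what the UTF-8 bytes of U+FFFD REPLACEMENT CHARACTER look like when misread as Latin-1. The Python 3 sketch below assumes the spaces in the quoted text are incidental.

```python
# U+FFFD is what many decoders substitute for a character they could not read.
replacement = "\ufffd"

# Its UTF-8 encoding is three bytes: EF BF BD.
utf8_bytes = replacement.encode("utf-8")

# Misreading those bytes as Latin-1 yields three separate characters.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes.hex(), misread)
```

If that reading is right, the original character (possibly a superscript such as ² in a unit of area) was lost at an earlier step, and only the replacement marker survived.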

Converting UTF-8 Characters to Upper/Lower case C++

Hello All, I have a string that contains UTF-8 characters, and I have a method that is supposed to convert every character to either upper or lower case. This is easily done with characters that overlap with ASCII, and obviously some characters cannot be converted, e.g. any Chinese character. However is there a good way to detect and co...

Django, Javascript, JSON and Unicode

I have a frustrating problem. I have a Django web app. The model contains various CharField columns. When I convert these strings into JSON using json.dumps, the strings come out as Unicode like this: "{'field': u'value'}" and so forth. However, I need to pass this to Javascript, and the jQuery parser croaks on this format. What I am ...
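The `u'value'` form quoted in this excerpt looks like the Python 2 `repr` of a dict rather than the output of `json.dumps`, though the truncated question does not confirm that. A short Python 3 sketch of the difference, using a made-up record:

```python
import json

# Hypothetical record standing in for the model's CharField values.
record = {"field": "value", "name": "Aur\u00e9lia"}

as_repr = repr(record)        # Python literal syntax, single quotes: not JSON
as_json = json.dumps(record)  # valid JSON, double quotes, non-ASCII escaped

print(as_repr)
print(as_json)
```

Only the second string is something a JavaScript JSON parser can be expected to accept.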

Javascript to validate password

Hi, I am new to regular expressions. Can anyone help me write a regular expression for the following description? The password contains characters from at least three of the following five categories: English uppercase characters (A - Z) English lowercase characters (a - z) Base 10 digits (0 - 9) Non-alphanumeric (For example: !, $,...

Get warning for python string literals not prefixed with 'u'

To follow best practices for Unicode in python, you should prefix all string literals of characters with 'u'. Is there any tool available (preferably PyDev compatible) that warns if you forget it? ...

python

unicode

pydev

What charset to use to store russian text into javascript files as an array

I am creating a ColdFusion page that takes language translation data stored in a table in my database and makes static JS files for each language pairing of English to _ etc... I am now starting to work on Russian; I was able to get the other languages to work fine. However, when it saves the file, all the text looks like question m...

unicode

coldfusion

translation

How to convert unicode string like u'\\u4f60\\u4f60' to u'\u4f60\u4f60' in Python?

I capture the string from a html source file using regex:
f = open(rrfile, 'r')
p = re.compile(r'"name":"([^"]+)","head":"([^"]+)"')
match = re.findall(p, f.read())
And I've tried:
>>> u'\\u4f60\\u4f60'.replace('\\u', '\u')
u'\\u4f60\\u4f60'
>>> u'\\u4f60\\u4f60'.replace(u'\\u', '\u')
u'\\u4f60\\u4f60'
>>> u'\\u4f60\\u4f60'...

python

unicode

replace
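A possible way to turn the literal backslash escapes into real characters, sketched for Python 3 (the question uses Python 2, where the same `unicode_escape` codec exists). Because the text captured by the regex appears to come from JSON, parsing it as a JSON string is an alternative; both variants are shown, and the variable names are illustrative.

```python
import json

# Text as captured from the file: a backslash, "u", and four hex digits, twice.
escaped = "\\u4f60\\u4f60"

# Option 1: interpret the escapes with the unicode_escape codec.
decoded = escaped.encode("ascii").decode("unicode_escape")

# Option 2: treat the capture as the body of a JSON string literal.
decoded_via_json = json.loads('"' + escaped + '"')

print(decoded, decoded_via_json)
```

The `unicode_escape` route assumes the captured text is pure ASCII; text that mixes escapes with raw non-ASCII characters would need different handling.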

Are all kanji characters 3 bytes long in UTF-8?

Can someone please confirm that all Kanji characters in Chinese are 3 bytes long in UTF-8? ...

unicode

utf-8

character-encoding
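A quick Python 3 check of the byte lengths involved. UTF-8 uses 3 bytes for code points U+0800 through U+FFFF and 4 bytes for U+10000 and above, so ideographs in the Basic Multilingual Plane take 3 bytes, while those in supplementary blocks such as CJK Unified Ideographs Extension B take 4. The sample characters are chosen only for illustration.

```python
def utf8_length(char):
    """Return the number of bytes char occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


bmp_ideograph = "\u4f60"            # a common ideograph in the BMP
extension_b_ideograph = "\U00020000"  # first code point of CJK Extension B

print(utf8_length(bmp_ideograph), utf8_length(extension_b_ideograph))
```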