utf-8

Python + PostgreSQL + strange ascii = UTF8 encoding error

I have ascii strings which contain the character "\x80" to represent the euro symbol: >>> print "\x80" € When inserting string data containing this character into my database, I get: psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x80 HINT: This error can also happen if the byte sequence does not match the encoding ...
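
A minimal Python 3 sketch of one way the byte could be re-encoded before it reaches the database. It assumes the data is really Windows-1252 (cp1252), where 0x80 is the euro sign; the excerpt does not confirm the source encoding, and the sample value `raw` is invented for illustration.

```python
# Assumption: the input bytes are Windows-1252 (cp1252), where 0x80 is the euro sign.
# `raw` is a made-up sample value, not data from the question.
raw = b"price: 10 \x80"

# Decode with the assumed source encoding to get a Unicode string.
text = raw.decode("cp1252")

# Encode as UTF-8, which is what a UTF8 PostgreSQL database expects.
utf8_bytes = text.encode("utf-8")

print(text)
print(utf8_bytes)
```

The single byte 0x80 becomes the three-byte UTF-8 sequence for U+20AC, which is why sending it unchanged is rejected as an invalid UTF-8 byte sequence.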

How do I get the number of visible characters from a UTF-8 encoded char*?

I have a UTF-8 encoded char*. Is there a standard function to calculate the number of visible characters represented by the byte array? I'm on Red Hat (RHEL 5). ...
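
A rough sketch of the usual counting trick, written in Python rather than C for brevity: every code point contributes exactly one byte that is not a continuation byte (a byte of the form 10xxxxxx). The function name `count_code_points` is hypothetical. This counts code points, which is only an approximation of "visible characters", since combining marks and zero-width characters are counted too.

```python
def count_code_points(data: bytes) -> int:
    """Count code points in well-formed UTF-8 by skipping continuation bytes."""
    # Continuation bytes have the bit pattern 10xxxxxx, i.e. (b & 0xC0) == 0x80.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


sample = "naïve €".encode("utf-8")
print(len(sample), count_code_points(sample))
```

The same loop translates directly to C over a char*, provided each byte is treated as unsigned.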

Why isn't UTF-8 allowed as the "ANSI" code page?

The Windows _setmbcp function allows any valid code page... (except UTF-7 and UTF-8, which are not supported) OK, not supporting UTF-7 makes sense: Characters have non-unique representations and that introduces complexity and security risks. But why not UTF-8? As I understand it, the "ANSI" versions of the Windows API functions...

[AS3] Calling Php Script with UTF-8 POST variables

AS3 documentation says that Strings in AS3 are in UTF-16 format. There is a textbox on a Flash Clip where user can type some data. When a button is clicked, I want this data to be sent to a php script. I have everything set up, but it seems that the PHP script gets the data in UTF-16 format. The data in the database (which is utf-8) s...

What character encoding should I use for a web page containing mostly Arabic text? Is utf-8 okay?

What character encoding should I use for a web page containing mostly Arabic text? Is utf-8 okay? ...

C++ UTF-8 lightweight & permissive code?

Anyone know of a more permissive license (MIT / public domain) version of this: http://library.gnome.org/devel/glibmm/unstable/classGlib_1_1ustring.html ('drop-in' replacement for std::string that's UTF-8 aware) Lightweight, does everything I need and even more (doubt I'll use the UTF-XX conversions even) I really don't want to be car...

UTF-8 - Oracle issue

Possible Duplicate: DBD::Oracle and utf8 issue I set my NLS_LANG variable as 'AMERICAN_AMERICA.AL32UTF8' in the perl file that connects to oracle and tries to insert the data. However when I insert a record with one value having this 'ñ' character the sql fails. But if I use 'Ñ' it inserts just fine. What am I doing wrong he...

How do I obtain a code point integer from a 1 to 4 byte UTF-8 encoded sequence in Windows?

Hello, I am Patrick Niedzielski, a programmer for the Free Software 3D adventure game Humm and Strumm. I'm working on a minimal Unicode character class in C++. I currently have an array of four bytes representing a UTF-8 sequence. On GNU/Linux, I can just convert to UTF-32 with iconv(), but on Windows, I cannot do this. Is it possib...
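
The decoding itself needs no platform library, only bit operations. The sketch below shows the arithmetic in Python; the function name `utf8_to_code_point` is hypothetical, and the code deliberately does not reject overlong forms or surrogate code points, which a production decoder should do.

```python
def utf8_to_code_point(seq: bytes) -> int:
    """Decode the first UTF-8 sequence in `seq` (1 to 4 bytes) to an integer."""
    if not seq:
        raise ValueError("empty sequence")
    first = seq[0]
    # The lead byte tells us the sequence length and carries the top bits.
    if first < 0x80:
        length, cp = 1, first
    elif first >> 5 == 0b110:
        length, cp = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, cp = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        length, cp = 4, first & 0x07
    else:
        raise ValueError("invalid lead byte")
    if len(seq) < length:
        raise ValueError("truncated sequence")
    # Each continuation byte (10xxxxxx) contributes six more bits.
    for b in seq[1:length]:
        if (b & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)
    return cp


print(hex(utf8_to_code_point("€".encode("utf-8"))))
```

The same shifts and masks carry over unchanged to C++ on an array of four unsigned bytes.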

VB.NET, MySQL and Unicode

How do I insert a textbox's Unicode string into a MySQL database? I changed the MySQL database's charset to utf8. I'm using VB.NET 2005 and a MySQL database for a Windows application. Please help me. ...

How to convert all characters to their html entity equivalent using PHP

I want to convert hello@domain.com so that every character becomes its HTML entity equivalent. I have tried url_encode($string), which returns the string I entered with the @ symbol converted to %40. I also tried htmlentities($string), which provides the same string righ...
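
As an illustration of what "every character as an entity" could mean, here is a small sketch in Python rather than PHP. It assumes decimal numeric character references are acceptable output; the helper name `to_numeric_entities` is hypothetical.

```python
def to_numeric_entities(text: str) -> str:
    """Replace every character with its decimal numeric character reference."""
    # ord() gives the Unicode code point, which is what &#N; refers to.
    return "".join(f"&#{ord(ch)};" for ch in text)


print(to_numeric_entities("a@b"))
```

Functions that only escape special characters leave ordinary letters untouched, which matches the behaviour described in the excerpt.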

š and other char not visible

Hi all, I'm going crazy over some Czech characters. In the DB I've saved strings with č (and similar chars), and I'm able to show them only if I set my page charset to ISO-8859-1. That could be OK, but I have a UTF-8 XML file, and when I try to get a string from this XML I have a problem. Basically the string from XML will not be shown correctly if...

OpenJPA & MySQL persist wrong encoded characters

Hi all, my MySQL DB has character encoding utf8. In QueryBrowser I can see that special characters are correct. In the application using OpenJPA I can see the same values, also correct. But when I persist an object into the DB, I have correct values in the application but incorrect ones in the DB! When I restart application that special characters in application ar...

C++ std::string and UTF-8

Hello, I just want to write a few simple lines to a text file in C++, but I want them to be encoded in UTF-8. What is the simplest way to do so? Thanks ...

Reading UTF-8 XML and writing it to a file with Python

I'm trying to parse UTF-8 XML file and save some parts of it to another file. Problem is, that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write non-ascii character to a file, but it can print it to command prompt (at least i...

Character Encoding

My text editor allows me to code in several different character formats: ANSI, UTF-8, UTF-8 (No BOM), UTF-16LE, and UTF-16BE. What is the difference between them? What is commonly regarded as the best format (I'm using Python if that makes a difference)? ...

encoding changes when retrieving data with php from mysql table

Hi all! I have a very strange problem when retrieving data with php from a mysql table. Basically, two php files with the EXACT same content are given data with different encodings and i dunno why. Here's the code: $dbhost = 'localhost'; $dbuser = 'myuser'; $dbpass = 'mypass'; $conn = mysql_connect($dbhost, $dbuser, $dbpass) or die ('Er...

How to store characters like ♥☆ to DB?

Previous issue - was not able to store non-english characters: http://stackoverflow.com/questions/3008918/how-to-store-non-english-characters That was fixed by using UTF8. But realized today that symbols like ♥☆ are not stored correctly. They get converted to characters like ♥☆. How can this be fixed? ...

reading Twitter API with JSON framework

Hi, I'm building a twitter reader into an app. I'm using the JSON framework library from Stig Brautaset (v2.2.2) to parse the twitter API. I'm seeing some odd results on certain messages. I know that the Twitter API returns results in UTF8 format. I'm wondering if I'm doing something wrong when reading the JSON parsed fields. My code i...

java: how to convert a file to utf8

Hi, I have a file that contains some non-UTF-8 characters (it is encoded as something like "ISO-8859-1"), and I want to convert that file to UTF-8 (or read it as such). How can I do it? The code is like this: File file = new File("some_file_with_non_utf8_characters.txt"); /* some code to convert the file to an utf8 file */ ... edit: Put an encoding example ...
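
The general recipe is to read the file with its real encoding and write it back out as UTF-8. The sketch below shows that in Python rather than Java. The function name `convert_to_utf8` is hypothetical, and ISO-8859-1 as the source encoding is an assumption taken from the example in the excerpt; the wrong source encoding yields silently wrong text.

```python
def convert_to_utf8(src_path: str, dst_path: str,
                    source_encoding: str = "iso-8859-1") -> None:
    """Re-encode a text file from `source_encoding` to UTF-8."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

In Java the same idea is usually expressed by wrapping the input and output streams in a reader and a writer, each constructed with an explicit charset.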

Maven filter garbling special characters

I have a resource file with the following string in it, note the special characters: Questa funzionalità non è sostenuta: {0} {1} After Maven does its process-resources (which I need for something else) I get: Questa funzionalit� non � sostenuta: {0} {1} Please tell me there is an easy fix to this? ...