utf-8

C++ ifstream UTF8 first characters

Why does a file saved as UTF-8 (in Notepad++) have these characters at the beginning of the fstream I opened on it in my C++ program? ´╗┐ I have no idea what they are; I just know that they're not there when I save as ASCII. UPDATE: If I save it as UTF-8 (without BOM) they're not there. How can I check the encoding of a file (ASCII or UTF8, ev...
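The three characters quoted above are consistent with the UTF-8 byte order mark (bytes EF BB BF) being displayed through a legacy console code page. The following is a minimal Python sketch of detecting and skipping that mark. It is an illustration only, not the asker's C++ code: the helper names `has_utf8_bom` and `strip_utf8_bom` are hypothetical, and code page 850 is an assumption about the console.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM if present; otherwise return the data unchanged."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# Simulated file contents: BOM followed by plain ASCII text.
sample = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Assumption: the console interprets raw bytes as code page 850.
bom_as_seen = codecs.BOM_UTF8.decode("cp850")

print(has_utf8_bom(sample))
print(strip_utf8_bom(sample))
# The "utf-8-sig" codec removes the BOM while decoding.
print(sample.decode("utf-8-sig"))
```

The same check carries over to other languages: read the first three bytes and compare them with EF BB BF before handing the rest to the parser.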

Certain UTF characters do not show up in browsers and break a Python script

Hi All, I generated a SQL script from a C# application on Windows 7. The name entries have UTF-8 characters. It works fine on a Windows machine where I use a Python script to populate the db. Now the same script fails on a Linux platform, complaining about those special characters. Similar things happened when I generated XML file containing...

PHP website not displaying correctly in UTF-8

I have a website which has some non-standard characters such as ë, Ç etc. The website uses ISO-8859-1 as its character encoding; however, at this point I want to switch it to UTF-8 for some reasons related to RSS feeds. When I change the character encoding to UTF-8, the mentioned characters are displayed incorrectly. I set the charset ...

Zend Search Lucene and Accented Characters

Hello, I'm trying to find a way in Zend_Search_Lucene to pull off the following scenario: Let's say we have a user and her name is Aïcha (note the special character). If I'm searching the index for Aicha (without the special derivative of i), I'd like for Aïcha to be returned in the results. Is there something special I need to do wh...
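One general approach to this kind of accent-insensitive matching is to fold accents out of both the indexed text and the query before comparing them. The sketch below shows the idea in Python using Unicode decomposition; it is not Zend_Search_Lucene's API, and the function name `fold_accents` is hypothetical.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Apply the same folding to indexed names and to search terms.
indexed_name = fold_accents("A\u00efcha")  # "Aïcha"
search_term = fold_accents("Aicha")

print(indexed_name)
print(indexed_name == search_term)
```

Folding at both index time and query time keeps the two sides comparable; the original spelling can still be stored separately for display.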

UTF-8 file: Filter everything but the image

I have a UTF-8 encoded file and would like to pop an output stream on the image part of the file. Any suggestions on how to filter everything out of the stream except the image data? ...

UTF8 characters not printed as such in Drupal's HTML.

I am trying to debug a nasty UTF-8 problem, and do not know where to start. A page contains the word 'categorieën', which should be categorieën. Clearly something is wrong with the UTF-8. This happens with all these multibyte characters. I have scanned the gazillion topics here on UTF8, but they mostly cover the basics, not this situat...
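One common way a word like this gets mangled is that its UTF-8 bytes are decoded as Latin-1 somewhere between the database and the page. Whether that is what happened in this Drupal setup is an assumption; the Python sketch below only shows what that failure mode looks like and how it can be reversed.

```python
# The correct word, with U+00EB (e with diaeresis) written as an escape.
word = "categorie\u00ebn"

# UTF-8 encodes U+00EB as two bytes (C3 AB). Decoding those bytes as
# Latin-1 turns them into two separate characters.
mangled = word.encode("utf-8").decode("latin-1")

# The damage is reversible as long as no bytes were lost along the way.
repaired = mangled.encode("latin-1").decode("utf-8")

print(mangled)
print(repaired)
```

Seeing two characters where one accented letter is expected is a typical sign of this mismatch, so checking each layer's declared charset (database connection, HTTP header, HTML meta tag) is a reasonable starting point.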

Delphi 2010 - IBX - UTF8 - dbmemo problem

I am migrating an application from Delphi 6 - IBX - Firebird 1.5 that works great to Delphi 2010 - Firebird 2.1 - UTF8 database. The problem is that if I use a TDBMemo to display data from a BLOB I get the following error: Debugger Exception Notification Project accedo.exe raised exception class EAccessViolation with message 'Ac...

Crazy characters - trying to insert into UTF-8

Hello. I'm trying to create a script that copies data from an old legacy MySQL database into my new UTF-8 formatted database. One particular field is causing me trouble; it's a latin1 field, and one record has the following in it: !-#$%'&*£¥ When the update is performed, I get the following error message: Zend_Db_Statement_Except...

PHP UTF-encoded URL-string

When I type a URL like http://www.example.com/?query=Траливали into the Firefox address bar, it is automatically encoded to http://www.example.com/?query=%D2%F0%E0%EB%E8%E2%E0%EB%E8. But a URL like http://www.example.com/#ajax_call?query=Траливали is not converted. Other browsers such as IE8 do not convert the query at all. The question i...
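The escapes quoted above use one byte per letter, which matches Windows-1251 rather than UTF-8 (UTF-8 needs two bytes for each Cyrillic letter). The Python sketch below reproduces both encodings with the standard `urllib.parse` functions; it illustrates the byte-level difference only and says nothing about how a PHP script should handle the request.

```python
from urllib.parse import quote, unquote

query = "Траливали"

# Percent-encode the same text under two different character encodings.
as_cp1251 = quote(query, encoding="cp1251")
as_utf8 = quote(query, encoding="utf-8")

# Decoding only round-trips when the receiver assumes the right encoding.
decoded = unquote("%D2%F0%E0%EB%E8%E2%E0%EB%E8", encoding="cp1251")

print(as_cp1251)
print(as_utf8)
print(decoded)
```

A server that always assumes UTF-8 would misread the Windows-1251 form, so the receiving side needs to know (or detect) which encoding the browser used.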

WideCharToMultiByte problem

I have the lovely functions from my previous question, which work fine if I do this: wstring temp; wcin >> temp; string whatever( toUTF8(getSomeWString()) ); // store whatever, copy, but do not use it as UTF8 (see below) wcout << toUTF16(whatever) << endl; The original form is reproduced, but the in between form often contains extr...

java.sql.SQLException: Incorrect string value: '\xF3\xBE\x8D\x81'

I am getting the following exception while trying to save some Tweets, Caused by: java.sql.SQLException: Incorrect string value: '\xF3\xBE\x8D\x81' for column 'twtText' at row 1 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956) ...
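The four bytes in the error message form a single valid UTF-8 sequence for a code point above U+FFFF. MySQL's `utf8` character set stores at most three bytes per character, while `utf8mb4` accepts four, so a likely explanation (assuming the `twtText` column uses the three-byte `utf8` set) is that the tweet contains such a character. The Python sketch below decodes the bytes and shows a hypothetical helper, `needs_four_byte_utf8`, for spotting affected strings.

```python
raw = b"\xF3\xBE\x8D\x81"

# The four bytes decode to exactly one character.
text = raw.decode("utf-8")
code_point = ord(text)


def needs_four_byte_utf8(s: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in s)


print(hex(code_point))
print(needs_four_byte_utf8(text))
print(needs_four_byte_utf8("caf\u00e9"))
```

Strings flagged by such a check either need a `utf8mb4` column and connection, or the offending characters must be removed or replaced before insertion.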

Unicode characters become question marks after inserting into database

When I insert some Unicode text into the database, it becomes question marks. The database encoding is set to UTF-8. What else may be incorrect? When I check in phpMyAdmin, only question marks have been inserted! This is the code I use for connecting to the database: define ("DB_HOST", "localhost"); // set database host define ("DB_USER...

Character encoding while reading data using Java-JDBC from Oracle database

We have data stored in an Oracle 10g database which contains French characters. The requirement is to read the data and generate an output file using Java. I checked the validity of the data in the Oracle database via SQL*Plus and it looks good. From Windows: set NLS_LANG=AMERICAN.AL32UTF8 sqlplus scott/tiger sql> select billing_address from MYTABLE ...

RTFString Containing NSASCIIStringEncoding Special Characters?

I have a string in my cocoa GUI that needs to have special formatting (fonts, colors, etc.). Naturally, I'm using an attributed string. For convenience, I Init the string as an RTF: NSString *inputString = @"This string has special characters"; NSString *rtfString = [NSString stringWithFormat:@"{@"***LENGTHY RTF FORMATTING STRING *** %@...

Reading utf8-encoded data from a connection, using Go.

I can easily write a string to a connection using io.WriteString. However, I can't seem to easily read a string from a connection. The only thing I can read from the connection are bytes, which, it seems, I must then somehow convert into a string. Assuming the bytes represent a utf8-encoded string, how would I convert them to string fo...

PayPal UTF-8 character

Sorry for my bad English. I am a PHP developer. I have a problem with UTF-8 characters in Russian. I am using the PayPal sandbox to test purchasing a product. When the customer's data is stored in the database, it does not show the proper data (Russian characters). So I have stored the data directly in a text file and check if ...

No UTF-8 when writing IPTC data to JPG with iptcembed

Hi, I use this function function iptc_make_tag($rec, $data, $value){ $length = strlen($value); $retval = chr(0x1C) . chr($rec) . chr($data); if($length < 0x8000) { $retval .= chr($length >> 8) . chr($length & 0xFF); } else { $retval .= chr(0x80) . chr(0x04) . chr(...

Easy way to convert &#XXXX; from HTML to UTF-8 XML, either programmatically in .NET or using tools

The HTML and XML are not the same; they are just given for illustration. For input HTML file <p class=MsoNormal style='tab-stops:.5in'><b><span style='mso-tab-count:3'>                                    </span></b><b><span lang=AR-SY dir=RTL style='mso-bidi-language:AR-SY'>&#1593;&#1586;&#1578; &#1575;&#1576;&#1585;&#1575;&#1607;&#1610;&#1605; &#1575;...
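Numeric character references such as `&#1593;` are decimal Unicode code points, so converting them comes down to resolving each reference and then encoding the result as UTF-8. The question asks about .NET; the Python sketch below only illustrates the idea with the standard `html.unescape` function, using the first three references from the excerpt.

```python
import html

# First three references from the excerpt above.
fragment = "&#1593;&#1586;&#1578;"

# Resolve each numeric reference to its actual character.
text = html.unescape(fragment)

# Encode the resulting string as UTF-8 bytes for output.
utf8_bytes = text.encode("utf-8")

print(text)
print(utf8_bytes)
```

Each of these Arabic letters takes two bytes in UTF-8, so the three references become six bytes of output.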

How to do it in Ruby on Rails

This is C# code: byte[] pb = System.Text.Encoding.UTF8.GetBytes(policy.ToString()); // Encode those UTF-8 bytes using Base64 string policyB = Convert.ToBase64String(pb); // Sign the policy with your Secret Key using HMAC SHA-1. System.Security.Cryptography.HMACSHA1 hmac = new System.Security.Cryptography.HMACSHA1(); hmac.Key = Syste...

Aspell decodes dictionary file as latin1 even if both environment and aspell config specify the encoding as UTF-8

Update: Apparently the solution to this is to use yet another configuration parameter to set the encoding: --encoding=UTF-8 on the command line. For example: zby@tvm1:/home/xpapers$ aspell --lang=en create master ./dictionary.local < w Warning: The word "Pérez" is invalid. The character '©' (U+A9) may not appear in the middle of a word...