utf-8

Python UTF-16 WAVY DASH encoding question / issue

Hi. I was doing some work today, and came across an issue where something "looked funny". I had been interpreting some string data as utf-8, and checking the encoded form. The data was coming from LDAP (specifically, Active Directory) via python-ldap. No surprises there. So I came upon the byte sequence '\xe3\x80\xb0' a few times, which...
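
For reference, a minimal Python 3 sketch of what that byte sequence decodes to. It assumes the bytes really are UTF-8, as the excerpt says; the variable names are illustrative only and do not come from the question.

```python
import unicodedata

# The three bytes quoted in the question, as a Python 3 bytes literal.
raw = b"\xe3\x80\xb0"

# Decode as UTF-8: a three-byte sequence yields a single code point.
ch = raw.decode("utf-8")
codepoint = ord(ch)
name = unicodedata.name(ch)

# The same character in UTF-16 (big-endian, no BOM) is two bytes.
utf16_be = ch.encode("utf-16-be")

print(f"U+{codepoint:04X} {name} utf-16-be={utf16_be.hex()}")
```

The UTF-16 form is the byte pair 30 30, which is also the ASCII text "00", so it can look odd when printed as a byte string.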

What groupware/collaboration system to choose?

I am looking for a good (stable enough, intuitive enough, and the more technologically modern and advanced - the better) free PHP-based online groupware/collaboration system. 100% UTF-8 is a requirement. OOP-style code is an advantage. And collaborative mindmapping would be a cool feature to have. The team is of ~20 people. Any suggestio...

WordPress plugin encoding not working

I am using a WordPress plugin that should be translated into Persian (UTF-8). The page encoding is correct, but when I edit the plugin in the control panel's editor page, the text turns into question marks. For example, see this page: http://farsi.tikatel.com/register/ The form in the body content is the plugin, which has been translated into Persian, but...

Changing default encoding of Python?

I have many "can't encode" and "can't decode" problems with Python when I run my applications from the console. But in the Eclipse PyDev IDE, the default character encoding is set to utf-8 and I'm fine. I searched around for setting the default encoding, and people say that Python deletes the sys.setdefaultencoding function on startup and we can not us...

How do I make Perl code DWIM with UTF8?

The utf8 pragma and utf8 encodings on filehandles have me confused. For example, this apparently straightforward code... use utf8; print qq[fü]; To be clear, the hex dump on "fü" is 66 c3 bc which if I'm not mistaken is proper UTF8. That prints 66 fc which is not UTF8 but Unicode or maybe Latin-1. Turn off use utf8 and I get 66 c3 ...
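
The two hex dumps quoted above can be reproduced outside Perl. This is a small Python 3 sketch (not Perl, and not the asker's code) showing that the same two-character string produces 66 c3 bc when encoded as UTF-8 and 66 fc when encoded as Latin-1; the variable names are illustrative.

```python
# "f" followed by U+00FC (u with diaeresis), written with an escape so the
# result does not depend on the encoding of this source file.
text = "f\u00fc"

# UTF-8 needs two bytes for U+00FC; Latin-1 needs one.
utf8_hex = text.encode("utf-8").hex(" ")
latin1_hex = text.encode("latin-1").hex(" ")

print(utf8_hex)
print(latin1_hex)
```

The characters are the same in both cases; only the encoding applied at output time differs.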

Windows C API for UTF8 to 1252

I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 -> UTF16 -> 1252 I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call? I should probably just pull in the iconv library, but am feeling lazy. Thanks ...

How do I transcode a JavaScript string to iso-8859-1?

I'm writing a Chrome extension that works with a website that uses iso-8859-1. To give some context, my extension makes posting in the site's forums quicker by adding a more convenient post form. The value of the textarea where the message is written is then sent through an Ajax call (using jQuery). If the message cont...

MySQL result is "special character"-insensitive

Hi, It seems that when I add a unique index to a MySQL table (with utf-8 table/columns), it returns a duplicate entry error. Example: ALTER TABLE name ADD UNIQUE(name) error: Duplicate entry 'Adé' for key 'name_UNIQUE' I think it's because of the following two rows in my database: Ade, Adé Is it possible to alter a table unique with spe...

C++ - How to read Unicode characters (e.g. Hindi script) using C++, or is there a better way through some other programming language?

Hi:) I have a Hindi script file like this: 3. भारत का इतिहास काफी समृद्ध एवं विस्तृत है। I have to write a program which adds a position to each and every word in each sentence. Thus the numbering for every line for a particular word position should start off with 1 in parentheses. The output should be something like this. 3. भारत...

How can I convert a cp1251 byte array to a utf8 String?

We don't have the cp1251 code page available on a phone, so new String( data, "cp1251" ) doesn't work. We need a function with signature something like String ArrayCp1251toUTF8String(byte data[]); ...

Getting ’ instead of an apostrophe(') in PHP

I've tried converting the text to or from utf8… didn't seem to help. I'm getting: "It’s Getting the Best of Me" It should be: "It’s Getting the Best of Me" I'm getting this data from a URL -> http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0 ...
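
The three characters in the title are what the UTF-8 bytes of a right single quotation mark (U+2019) look like when they are decoded as Windows-1252. The Python 3 sketch below (not PHP) reproduces the garbling and reverses it. It assumes that cp1252 is the wrong decoding involved, which the excerpt does not confirm; the variable names are illustrative.

```python
# The intended title, with U+2019 as the apostrophe.
correct = "It\u2019s Getting the Best of Me"

# Encode as UTF-8, then decode those bytes with the wrong codec (assumed cp1252).
garbled = correct.encode("utf-8").decode("cp1252")

# Reversing the two steps recovers the original, as long as no bytes were lost.
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

If the round trip works on real data, the usual fix is to decode the response as UTF-8 in the first place rather than to repair it afterwards.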

Byte array to UTF8 CString

Hi all, I'm using Visual Studio 2008 (C++). How do I create a CString (in a non-Unicode app) from a byte array that has a string encoded in UTF8 in it? Thanks, kreb EDIT: Clarification: I guess what I'm asking is.. CStringA doesn't seem to be able to interpret a UTF8 string as UTF8, but rather as ASCII or the current codepage (I th...

How to find out if string has already been URL encoded?

How could I check if string has already been encoded? For example, if I encode TEST==, I get TEST%3D%3D. If I again encode last string, I get TEST%253D%253D, I would have to know before doing that if it is already encoded... I have encoded parameters saved, and I need to search for them. I don't know for input parameters, what will th...
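
One common heuristic, sketched here in Python 3 with the standard urllib.parse module, is to decode the string and see whether anything changes. The function names looks_percent_encoded and encode_once are hypothetical. The check cannot be fully reliable: a raw value that happens to contain text such as "%3D" is indistinguishable from an encoded one.

```python
from urllib.parse import quote, unquote


def looks_percent_encoded(value: str) -> bool:
    """Heuristic: True if decoding changes the string (it contains valid %XX escapes)."""
    return unquote(value) != value


def encode_once(value: str) -> str:
    """Percent-encode value unless it already looks encoded (hypothetical helper)."""
    if looks_percent_encoded(value):
        return value
    return quote(value, safe="")


once = quote("TEST==", safe="")   # encode the raw value
twice = quote(once, safe="")      # encoding again escapes the % signs

print(once, twice, encode_once(once))
```

A more robust design is to store values in one known form (always raw or always encoded) so that no guessing is needed at search time.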

Delphi dbExpress and Interbase: UTF8 migration steps and risks?

Currently, our database uses Win1252 as the only character encoding. We will have to support Unicode in the database tables soon, which means we have to perform this migration for four databases and around 80 Delphi applications which run in-house in a 24/7 environment. Are there recommendations for database migrations to UTF-8 (or UNICO...

normalizing accented characters in MySQL queries

I'd like to be able to do queries that normalize accented characters, so that for example: é, è, and ê are all treated as 'e', in queries using '=' and 'like'. I have a row with username field set to 'rené', and I'd like to be able to match on it with both 'rene' and 'rené'. I'm attempting to do this with the 'collate' clause in MyS...
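
As an illustration of the matching behaviour being asked for, here is an application-side sketch in Python 3. It is a different technique from the MySQL collate clause in the question: it decomposes each character with Unicode normalization and drops the combining marks. The function name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks removed after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Both spellings compare equal once accents are stripped.
stored = "ren\u00e9"
matches = strip_accents(stored) == strip_accents("rene")
print(strip_accents(stored), matches)
```

This only handles letters that decompose into a base letter plus combining marks; characters such as "ø" or "ß" are left unchanged.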

How do I create a Unicode filename in Linux?

I heard fopen supports UTF8, but I don't know how to convert an array of shorts to UTF8. How do I create a file with Unicode letters in it? I prefer to use only built-in libraries (no Boost, which is not installed on the Linux box). I do need to use fopen but it's pretty simple to. ...

How do I copy a file with a UTF-8 filename to another UTF-8 filename in Perl on Windows?

For example, given an empty file テスト.txt, how would I make a copy called テスト.txt.copy? My first crack at it managed to access the file and create the new filename, but the copy generated テスト.txt.copy. Here was my first crack at it: #!/usr/bin/env perl use strict; use warnings; use English '-no_match_vars'; use File::Basename; ...

How to output UTF-8 (Kannada) characters in the Windows terminal using Java

I am working on a Java (Tomcat) app that sometimes writes to stdout. But I notice that Indic languages (say, Kannada) turn out as ?????? characters on the standard Windows console (terminal) on Windows Vista (SP1 Home Premium 64-bit). I know that I could run Tomcat from within Emacs (GNU Emacs 23.1.50.1 (i386-mingw-nt6.0.6001)) so I could see th...

Reading unicode characters from text file in Delphi 2009

I have the following piece of code to read Japanese Kanji characters from UTF-8 format Text file and then load it into Memo. Var F:textFile; S:string; Begin AssignFile(F,'file.txt'); Reset(F); While not EoF(F) do Begin Readln(F,S); Memo1.Lines.Add(S); End; CloseFile(F); End; But instead of characters I see some set of totall...

Vim, iconv+nr2char and iconv+"\x.."

echo strtrans(iconv( "\x80", "utf-8", "utf-32")) Outputs «??» and echo strtrans(iconv(nr2char(0x80), "utf-8", "utf-32")) outputs «<80>». Why? (zyx:~) % LANG=C vim --version VIM - Vi IMproved 7.2 (2008 Aug 9, compiled Feb 12 2010 07:37:05) Included patches: 1-303 Modified by Gentoo-7.2...