unicode

Why do Perl string operations on Unicode characters add garbage to the string?

A quick question for a change. Perl: $string =~ s/[áàâã]/a/gi; #This line always prepends an "a" $string =~ s/[éèêë]/e/gi; $string =~ s/[úùûü]/u/gi; These regexes should convert "été" into "ete". Instead, they convert it to "aetae". In other words, an "a" is prepended to every matched element. Even "à" is converted to "aa". ...
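The reported output ("aetae", "aa") is what one would expect if the substitutions ran on UTF-8 bytes rather than on decoded characters. The sketch below is a Python analogue, not the asker's Perl, and the byte-level explanation is an assumption since the excerpt is truncated. The function names are made up for illustration.

```python
import re

# The three substitutions from the question, as (accented letters, replacement).
RULES = (("áàâã", "a"), ("éèêë", "e"), ("úùûü", "u"))


def strip_accents_bytes(data: bytes) -> bytes:
    """Apply the rules to raw UTF-8 bytes (the suspected faulty behaviour)."""
    for accented, plain in RULES:
        # Each accented letter is two bytes in UTF-8, so the character class
        # ends up matching the individual bytes, not whole letters.
        pattern = b"[" + accented.encode("utf-8") + b"]"
        data = re.sub(pattern, plain.encode("ascii"), data)
    return data


def strip_accents_text(text: str) -> str:
    """Apply the same rules to decoded text (one match per letter)."""
    for accented, plain in RULES:
        text = re.sub("[" + accented + "]", plain, text)
    return text


print(strip_accents_bytes("été".encode("utf-8")))
print(strip_accents_text("été"))
```

If this is the cause, the usual direction in Perl is to make sure both the source literals and the input string are treated as characters rather than bytes before the regex runs.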

file name with special characters like "é" NOT FOUND

I have a folder on my website just for random files. I used PHP's opendir to list all the files so I can style the page a bit. But the files that I uploaded with special characters in their names don't work. When I click on them, it says the files are not found, but when I check the directory, the files are there. It seems like the links are wrong. a...

Which packages for many unicode-characters?

I'm trying to create a LaTeX document with as many Unicode characters as possible. My header is as follows: \documentclass[12pt,a4paper]{article} \usepackage[utf8x]{inputenc} \usepackage{ucs} \pagestyle{empty} \usepackage{textcomp} \usepackage[T1]{fontenc} \begin{document} The Unicode characters which follow in the document-body a...

Getting character from the Unicode code-point - C++

I have two questions. 1 - I get Unicode code points; how do I get the character associated with a given code point? Something like: int code_point = 0xD24; char* chr = (char*) code_point; But the above code fails by throwing an exception. 2 - Suppose the code point is stored in a file and I read the code point into a string, how do I...
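Casting a code point to a pointer cannot work, because a code point is a number that still has to be encoded into bytes. As a rough illustration (in Python rather than C++, with a hypothetical helper name), the sketch below encodes one code point as UTF-8 by hand and shows how a code point read as hexadecimal text can be parsed first.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (1 to 4 bytes)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:  # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:  # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:  # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# A code point stored as hexadecimal text is parsed to an integer first.
parsed = int("0D24", 16)
print(encode_utf8(parsed))           # the UTF-8 bytes for U+0D24
print(encode_utf8(parsed).decode())  # the character itself
```

The same bit layout carries over to C++; only the container for the resulting bytes changes.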

LoadLibraryW doesn't work while LoadLibraryA does the job

I have written some sample program and DLL to learn the concept of DLL injection. My injection code to inject the DLL to the sample program is as follows (error handling omitted): std::wstring dll(L"D:\\Path\\to\\my\\DLL.dll"); LPTHREAD_START_ROUTINE pLoadLibraryW = (LPTHREAD_START_ROUTINE)GetProcAddress(hKernel32, "LoadLibraryW")...

How to handle unicode strings in a XeLaTeX document?

Hi, an earlier question led me to XeLaTeX (it was about LaTeX and Unicode). So now I've got this document: \documentclass[a4paper]{article} \usepackage[cm-default]{fontspec} \usepackage{xunicode} \usepackage{xltxtra} \setmainfont[Mapping=tex-text]{Arial} \begin{document} গ a ä ͷ \end{document} With the font "Arial" only the a and th...

django Unicode GET Parameter Values

I'm trying to get a GET parameter value that looks like this: http://someurl/handler.json?&q=%E1%F8%E0%F1%F8%E9 The q parameter in this case is Hebrew. I'm trying to read the value using the following code: request.GET.get("q", None) I'm getting gibberish instead of the correct text. Any idea what's wrong here? Am I missing some ...
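The percent-encoded bytes in that URL are single bytes per letter, which suggests a legacy Hebrew code page rather than UTF-8. The sketch below uses only the standard library (not Django) and assumes the bytes are Windows-1255; that encoding is a guess based on the byte values, not something the excerpt states.

```python
from urllib.parse import unquote

RAW_Q = "%E1%F8%E0%F1%F8%E9"  # the q value from the question's URL

# Decoded as UTF-8 (the default), the bytes are invalid and are replaced
# with U+FFFD replacement characters, which shows up as gibberish.
as_utf8 = unquote(RAW_Q)

# Decoded as Windows-1255 (an assumption), each byte maps to a Hebrew letter.
as_cp1255 = unquote(RAW_Q, encoding="cp1255")

print(repr(as_utf8))
print(as_cp1255)
```

If the assumption holds, the remedy is either to have the client send UTF-8 or to tell the server which encoding to use when decoding the query string.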

When and Why Should I Use TStringBuilder?

I converted my program from Delphi 4 to Delphi 2009 a year ago, mainly to make the jump to Unicode, but also to gain the benefits of all those years of Delphi improvements. My code, of course, is therefore all legacy code. It uses short strings that have now conveniently all become long Unicode strings, and I've changed all the old ANSI...

How to convert Unicode NCR form to its original form in PHP?

To avoid "monster characters", I chose Unicode NCR form to store non-English characters in the database (MySQL). Yet, the PDF plugin I use (FPDF) does not accept Unicode NCR form as a correct format; it displays the data directly like: &#36889;&#20491;&#19968;&#20491;&#20363;&#23376; but I want it to display like: 這個一個例子 Is there any met...
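Converting numeric character references back to characters is a standard decoding step. The sketch below shows it in Python with the standard library; in PHP, html_entity_decode plays a similar role. The sample string is the decimal NCR spelling of the example text, written out here for illustration.

```python
import html

# Decimal numeric character references for the six example characters.
ncr_text = "&#36889;&#20491;&#19968;&#20491;&#20363;&#23376;"

# html.unescape turns each &#NNNNN; reference into the character it names.
decoded = html.unescape(ncr_text)

print(decoded)
```

Whether the PDF library can then render the decoded characters depends on its font and encoding support, which this sketch does not address.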

Unicode handling in ReportLab

I am trying to use ReportLab with Unicode characters, but it is not working. I tried tracing through the code till I reached the following line: class TTFont: # ... def splitString(self, text, doc, encoding='utf-8'): # ... cur.append(n & 0xFF) # <-- here is the problem! # ... (This code can be found in ...

Delphi 7 Personal, MySQL using libmysql.dll + UTF8

Hi, I'm using Delphi 7 Personal. To access a MySQL database I'm using libmysql.dll + a very simple wrapper, which is good enough for me. Except one thing ... it doesn't seem to handle UTF-8... is it somehow possible to pass UTF-8 strings from libmysql to Delphi? Please keep in mind I'm not using a commercial Delphi edition, which means no ADO / dbExpr...

Eclipse French support

I need to enter some French characters in Eclipse. How do I configure Eclipse to enter French? I do have all the fonts that come with the default Eclipse packaging. ...

Python xml.dom.minidom Unicode

Hi there, I'm trying to create an XML document in Python, but some of the strings I'm working with are Unicode strings. Is there a way to create a text node from Unicode strings using xml.dom.minidom? Is there another module I can use? Thanks. ...
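For reference, xml.dom.minidom accepts Unicode text directly in createTextNode. The sketch below assumes Python 3, where every str is Unicode; the element name "note" and the helper name are made up for illustration.

```python
from xml.dom.minidom import Document, parseString


def build_document(text: str) -> bytes:
    """Build a one-element XML document containing the given text."""
    doc = Document()
    root = doc.createElement("note")  # hypothetical element name
    doc.appendChild(root)
    root.appendChild(doc.createTextNode(text))
    # Passing an encoding makes toxml return bytes in that encoding.
    return doc.toxml(encoding="utf-8")


xml_bytes = build_document("été")
print(xml_bytes)

# Parsing the bytes back recovers the original text.
round_trip = parseString(xml_bytes).documentElement.firstChild.data
print(round_trip)
```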

Unicode Encoding and decoding issues in QRCode

I am trying to generate a UTF-8 QRCode so that I can encode accents and Unicode characters. To test it, I am using several decoders: http://zxing.org/w/decode.jspx - The zxing project, also used in Android http://www.drhu.org/QRCode/QRDecoder.php - a PHP decoder http://zbar.sf.net - The ZBar bar code reader - open source and C proj...

Converting Unicode strings to escaped ascii string

How can I convert this string: This string contains the unicode character Pi(π) into an escaped ASCII string: This string contains the unicode character Pi(\u03c0) and vice versa? The Encoding classes currently available in C# convert the π character into "?". I need to preserve that character. ...
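The conversion asked about is a per-character mapping: ASCII passes through, everything else becomes a \uXXXX escape (π is U+03C0). The sketch below shows one way to do both directions in Python rather than C#. The function names are hypothetical, and characters outside the Basic Multilingual Plane are written as UTF-16 surrogate pairs, matching how C# strings store them.

```python
import re


def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with \\uXXXX escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000
            out.append("\\u%04x\\u%04x" % (0xD800 + (cp >> 10),
                                           0xDC00 + (cp & 0x3FF)))
    return "".join(out)


def unescape_non_ascii(text: str) -> str:
    """Turn \\uXXXX escapes back into characters."""
    units = re.sub(r"\\u([0-9a-fA-F]{4})",
                   lambda m: chr(int(m.group(1), 16)), text)
    # Recombine any surrogate pairs into single characters.
    return units.encode("utf-16", "surrogatepass").decode("utf-16")


escaped = escape_non_ascii("Pi(π)")
print(escaped)
print(unescape_non_ascii(escaped))
```

One limitation of this sketch: a literal backslash followed by "u" and four hex digits in the input is not protected, so such input would not round-trip unchanged.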

Data in Sql Server should use Unicode?

I want to store English, French, German, Italian, and Spanish text in a SQL Server 2005 database to be used with a .NET application. Can I get away with not using Unicode? Will there be any issues with these languages? ...

Converting non-unicode SQL Server data and stored procs to Unicode

I need to convert a non-Unicode SQL Server 2005 database to a Unicode-based database. I have hundreds of stored procs and of course the data is stored in varchar. I know that I need to change all the data types to the Unicode equivalent (varchar to nvarchar) but don't I have to change how the stored procs are written or will they conti...

Case insensitive search in Unicode in C++ on Windows

I asked a similar question yesterday, but I recognize that I need to rephrase it. In short: in C++ on Windows, how do I do a case-insensitive search for a string (inside another string) when the strings are in Unicode format (wide char, wchar_t), and I don't know the language of the strings? I just want to know whether t...

Python zlib output, how to recover out of mysql utf-8 table?

In Python, I compressed a string using zlib, and then inserted it into a MySQL column that is of type blob, using the utf-8 encoding. The string comes back as utf-8, but it's not clear how to get it back into a format where I can decompress it. Here is some pseudo-output: valueInserted = zlib.compress('a') = 'x\x9cK\x04\x00\x00b\x00b' ...
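Compressed output is arbitrary binary data, so passing it through a text encoding can corrupt it. The sketch below, assuming Python 3, takes the compressed value shown in the question, shows that it is not valid UTF-8, and round-trips it through Base64 as one text-safe option. Storing the raw bytes in a binary column without any character-set conversion is the other common approach.

```python
import base64
import zlib

# The compressed value quoted in the question.
compressed = b"x\x9cK\x04\x00\x00b\x00b"

# The bytes are not valid UTF-8, so treating them as text cannot be lossless.
try:
    compressed.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Base64 yields plain ASCII that survives any text column or connection.
stored_text = base64.b64encode(compressed).decode("ascii")
restored = zlib.decompress(base64.b64decode(stored_text))

print(decodes_as_utf8)
print(stored_text)
print(restored)
```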

Split Unicode String with Ruby

How can I split a String by Unicode range in Ruby? I want to split it into the parts under \u1000 and the parts at \u1000 and over, separated with a comma. For example, I want to turn this string... I love ျမန္မာ into this... I love, ျမန္မာ You may not see the Unicode characters in my example. They are in the Unicode range \u1000 and over. Thanks. ...
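One way to approach this is to collect maximal runs of characters on each side of U+1000 and join them with a comma. The sketch below is in Python rather than Ruby, and the function name is hypothetical. It also trims whitespace around each run, which is an assumption about the desired output.

```python
import re

# A run of characters at or above U+1000, or a run of characters below it.
RUNS = re.compile("[\u1000-\U0010FFFF]+|[^\u1000-\U0010FFFF]+")


def split_at_u1000(text: str) -> str:
    """Join the runs below and above U+1000 with ', '."""
    parts = (run.strip() for run in RUNS.findall(text))
    # Runs made only of whitespace become empty after stripping; drop them.
    return ", ".join(part for part in parts if part)


print(split_at_u1000("I love \u1019\u103c"))
```

Because a space is below U+1000, two high-range words separated by a space end up as separate comma-joined items in this sketch.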