unicode

How do I HTML-/URL-encode a std::wstring containing Unicode characters?

Hi, I have yet another question. If I had a std::wstring looking like this: ドイツ語で検索していてこちらのサイトにたどり着きました。 How could I get it URL-encoded (%nn, n = 0-9, a-f) to: %E3%83%89%E3%82%A4%E3%83%84%E8%AA%9E%E3%81%A7%E6%A4%9C%E7%B4%A2%E3%81%97%E3%81%A6%E3%81%84%E3%81%A6%E3%81%93%E3%81%A1%E3%82%89%E3%81%AE%E3%82%B5%E3%82%A4...
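A minimal sketch of the transformation the question describes, written in Python rather than C++ for brevity: encode the text as UTF-8, then write each byte as %XX. The helper name percent_encode_utf8 is hypothetical, and the sketch assumes the target encoding is UTF-8, which is what the expected output in the question implies.

```python
from urllib.parse import quote


def percent_encode_utf8(text: str) -> str:
    """Encode text as UTF-8 and percent-encode every byte outside the unreserved set."""
    # safe="" means even "/" is escaped; hex digits come out uppercase.
    return quote(text, safe="", encoding="utf-8")


# First four characters of the sample string from the question.
encoded = percent_encode_utf8("ドイツ語")
print(encoded)
```

The same two steps apply in C++: convert the wide string to UTF-8 bytes first, then escape byte by byte.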

Unicode in Perl not working

I have some text files which I am trying to transform with a Perl script on Windows. The text files look normal in Notepad++, but all the regexes in my script were failing to match. Then I noticed that when I open the text files in Notepad++, the status bar says "UCS-2 Little Endia" (sic). I am assuming this corresponds to the encoding U...

Unicode character for PUZZLE PIECE?

Is there a Unicode symbol representing puzzle pieces? There are lots of seldom-used dingbats in Unicode, and I don't really remember, but I suspect there's one for this too. However, I couldn't find anything like it in gucharmap, because it's probably not complete (lacks Klingon!). And neither in the tables on unicode.org/chart...

Java Unicode characters error in cmd

I have the following class in Java which prints "Hello World" in Portuguese: public class PrintUnicode { public static void main(String[] args) { System.out.println("Olá Mundo!"); } } I am using Eclipse, so I exported the project to a Runnable Jar File. After that, I went to cmd (Windows 7) and ran the generated jar fi...

Android WebView UTF-8 not showing

I have a webview and am trying to load simple UTF-8 text into it. mWebView.loadData("將賦予他們的傳教工作標示為", "text/html", "UTF-8"); But the WebView displays ANSI/ASCII garbage. Obviously an encoding issue, but what am I missing in telling the webview to display the Unicode text? This is a HelloWorld app. ...

SWT StyledText and Unicode support

I need to display a set of strings which contain characters from the GB18030 set in an SWT StyledText field. Most of them are displayed correctly, but some of them are displayed as boxes. Is this because I am supposed to install a certain font that supports the GB18030 character set? If so, what font should I have installed? Or should I...

Encoding errors in .jspx

I'm currently trying to deploy some RSS feeds on a WebLogic Application Server. The feeds' views are .jspx files, like the one below: <?xml version="1.0" encoding="utf-8"?> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss" xmlns:jsp="http://java.sun.com/JSP/Page" xmlns:c="http://java.sun....

Should I be able to quote a leading or trailing dollar sign ($) inside a word boundary in Java Regular Expression?

I'm having trouble getting regular expressions with leading / trailing $'s to match in Java (1.6.20). From this code: System.out.println( "$40".matches("\\b\\Q$40\\E\\b") ); System.out.println( "$40".matches(".*\\Q$40\\E.*") ); System.out.println( "$40".matches("\\Q$40\\E") ); System.out.println( " ------ " ); System.out.println( "40$"...

CRLF translation with Unicode in Perl

I'm trying to write to a Unicode (UCS-2 Little Endian) file in Perl on Windows, like this. open my $f, ">$fName" or die "can't write $fName\n"; binmode $f, ':raw:encoding(UCS-2LE)'; print $f "ohai\ni can haz unicodez?\nkthxbye\n"; close $f; It basically works except I no longer get the automatic LF -> CR/LF translation on output that...

What are some areas of C++ code that could be affected by porting to Visual Studio 2005 and changing to Unicode?

We recently ported legacy code over to Visual Studio 2005 and Unicode. What are the key areas that are affected by switching to the Unicode character set? ...

Help on Regular Expression problem

Hi, I wonder if it's possible to make a RegEx for the following data pattern: '152: Ashkenazi A, Benlifer A, Korenblit J, Silberstein SD.' string = '152: Ashkenazi A, Benlifer A, Korenblit J, Silberstein SD.' I am using this Regular Expression (Using Python's re module) to extract these names: re.findall(r'(\d+): (.+), (.+), (.+), (...

ANSI, ASCII, Unicode and encoding confusion with Python

Hi! I was happily using BeautifulSoup and I'm also using a text file as input parameters of my Python script. I then came across the famous "UnicodeEncodeError" error. I've been reading questions here at SO but I'm still confused. What does ASCII have to do with all of this? What encoding do I use in my text editor (Notepad++)? ANSI? ...

Is there a programming language with full and correct Unicode support?

Most programming languages have some support for Unicode, but all have some more or less documented corner cases where things won't work correctly. Examples: Java: reverse() in StringBuilder/StringBuffer works correctly. But length(), charAt(), etc. in String do not if a character needs more than 16 bits to encode. C#: Didn't find a co...
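The Java corner case mentioned above comes from String.length() counting UTF-16 code units rather than characters. A rough Python sketch of the difference follows; the function names code_points and utf16_code_units are hypothetical and only illustrate the two ways of counting.

```python
def code_points(text: str) -> int:
    """Number of Unicode code points (what Python 3's len() reports)."""
    return len(text)


def utf16_code_units(text: str) -> int:
    """Number of 16-bit units in the UTF-16 encoding of text."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane,
# so UTF-16 needs a surrogate pair (two code units) for it.
clef = "\U0001D11E"
print(code_points(clef), utf16_code_units(clef))
```

Any API that indexes by 16-bit unit will see two elements here where a reader sees one character.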

How to send integers to Arduino via serial?

Hi fellows, I want to send integers to an Arduino via serial. For example, when I send "1" the data received by the Arduino is "49", and when I send "a" the data received by the Arduino is "97". There are two functions in Python, ord() and unichr(); for example, unichr(97) = u"a" and ord(u"a") = 97, but I'm not that good at the C language. ...
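The 49 and 97 in the excerpt are the ASCII codes of the characters "1" and "a". A small Python 3 sketch of the two ways an integer can be put on the wire follows; the helper names as_ascii_text and as_raw_byte are hypothetical, no serial library is used, and Python 3's chr() stands in for Python 2's unichr().

```python
def as_ascii_text(value: int) -> bytes:
    """Send the number as its decimal digits: 1 becomes the byte 0x31 (49)."""
    return str(value).encode("ascii")


def as_raw_byte(value: int) -> bytes:
    """Send the number as a single byte: 1 becomes the byte 0x01."""
    # bytes([...]) raises ValueError for values outside 0..255.
    return bytes([value])


text_form = as_ascii_text(1)
raw_form = as_raw_byte(1)
print(text_form[0], raw_form[0], ord("a"), chr(97))
```

Whichever form the sender picks, the receiving sketch has to decode the same form, either by parsing digits or by reading the byte value directly.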

Perl Text::CSV_XS Encoding Issues

I'm having issues with Unicode characters in Perl. When I receive data from the web, I often get characters like “ or €. The first one is a quotation mark and the second is the Euro symbol. Now I can easily substitute in the correct values in Perl and print to the screen the corrected words, but when I try to output to ...

Using Unicode in a C++ source file

I'm working with a C++ source file in which I would like to have a quoted string that contains Asian Unicode characters. I'm working with Qt on Windows, and the Qt Creator development environment has no problem displaying the Unicode. The QStrings also have no problem storing Unicode. When I paste in my Unicode, it displays fine, somethi...

How to extract a UTF-8 string (in Arabic) from a MySQL DB and echo to screen using PHP

Hi, I have a MySQL DB, and I've set collation = utf8_unicode_ci. I'm trying to fetch the value through PHP but I'm getting "???" instead of the actual string. I have read about this subject and tried using mb_convert_encoding but it didn't work. What am I missing? Can someone please post a code snippet that actually pulls a value from a ...

How to deal with Unicode in Mako?

I constantly get this error using Mako: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 6: ordinal not in range(128) I've told Mako I'm using Unicode in every possible way: mylookup = TemplateLookup( directories=['plugins/stl/templates'], input_encoding='utf-8', output_encoding='...
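The error quoted above is raised whenever text containing a non-ASCII character is encoded with the ASCII codec. A minimal Python 3 sketch that reproduces it outside of any template engine follows; it makes no claim about where the implicit ASCII encode happens in the asker's setup, and the sample string is made up.

```python
# Sample text with U+00E0 (the character named in the error) at index 4.
text = "voil\u00e0"

try:
    text.encode("ascii")  # ASCII only covers code points 0..127
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Encoding explicitly as UTF-8 succeeds; U+00E0 becomes the two bytes C3 A0.
utf8_bytes = text.encode("utf-8")
print(ascii_error)
print(utf8_bytes)
```

In practice the fix is usually to find the place where text is converted to bytes without an explicit encoding and to name UTF-8 there.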

How do I insert Unicode into MS-SQL?

I want to insert info.NativeName into an nvarchar field in the database. It doesn't work; all I get is ??????? where the encoding is not western/latin. Outputting listcultures directly in an ASP.NET website on page_onload worked fine, but it seems not to work via database. Public Sub listcultures() 'Dim x As System.DateTime = DateTi...

Wrong Unicode conversion: how to store accented characters in Delphi 2010 source code and handle character sets?

We are upgrading our project from Delphi 2006 to Delphi 2010. Old code was: InputText: string; InputText := SomeTEditComponent.Text; ... for i := 1 to length(InputText) do if InputText[i] in ['0'..'9', 'a'..'z', 'Ř' { and more special characters } ] then ... The trouble is with accented letters: the comparison fails. I tried switch source co...