unicode

i18n shell in windows

Is there an i18n shell in Windows that supports a large character set? Testing my application in Windows results in some math characters not being rendered correctly. The Lucida font in cmd.exe and PowerShell does not have a wide enough selection. UTF-8 would be the most preferable, followed by the other Unicode encodings. ...

Unicode to UTF8 for CSV Files - Python via xlrd

I'm trying to translate an Excel spreadsheet to CSV using the Python xlrd and csv modules, but am getting hung up on encoding issues. Xlrd produces output from Excel in Unicode, and the CSV module requires UTF-8. I imagine that this has nothing to do with the xlrd module: everything works fine outputting to stdout or other outputs that d...
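A minimal sketch of the CSV-writing half of this problem, assuming Python 3, where the csv module accepts text and the encoding step happens separately. The rows are hypothetical stand-ins for cell values read with xlrd; on Python 2, which the question appears to describe, each cell would instead need to be encoded to UTF-8 before being passed to the writer.

```python
import csv
import io


def rows_to_utf8_csv(rows):
    """Serialise rows of text cells to UTF-8 encoded CSV bytes."""
    # newline="" stops StringIO from translating the writer's line endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue().encode("utf-8")


# Hypothetical data; real rows would come from the spreadsheet.
sample_rows = [["name", "city"], ["José", "Zürich"]]
csv_bytes = rows_to_utf8_csv(sample_rows)
```

When writing straight to disk, opening the file with `open(path, "w", newline="", encoding="utf-8")` achieves the same result without the intermediate buffer.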

toLowerCase with special/unicode characters throws exception

Correct me if I'm wrong. If str has a character such as "•" in it, then running str.toLowerCase(Locale.ENGLISH); throws a null pointer exception. That's the behavior I'm seeing. So what's the deal here? What's going on? It isn't specified that toLowerCase throws a null pointer exception. Is there an easy way to get around this? I n...

How to save unicode data to oracle?

I am trying to save Unicode data (Greek) in an Oracle database (10g). I have created a simple table: I understand that NVARCHAR2 always uses UTF-16 encoding, so it must be fine for all (human) languages. Then I am trying to insert a string in the database. I have hardcoded the string ("How are you?" in Greek) in code. Then I try to get it b...

How can I check whether a byte array contains a unicode string in Java

Given a byte array that is either a UTF-8 encoded string or arbitrary binary data, what approaches can be used in Java to determine which it is? The array may be generated by code similar to: byte[] utf8 = "Hello World".getBytes("UTF-8"); Alternatively it may have been generated by code similar to: byte[] messageContent = new byte[2...
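One common approach is to attempt a strict UTF-8 decode and treat failure as "binary". The sketch below shows the idea in Python rather than the Java the question asks about; the function name and the sample binary payload are hypothetical. Note the limitation: arbitrary binary data can happen to be valid UTF-8, so a successful decode is evidence, not proof.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


text_bytes = "Hello World".encode("utf-8")
binary_bytes = bytes([0xFF, 0xFE, 0x00, 0x80])  # hypothetical binary payload

is_text = looks_like_utf8(text_bytes)
is_binary_text = looks_like_utf8(binary_bytes)
```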

Arabic characters not accepted in Oracle DB

Hi, this is the database admin. I have a JBoss server with an Oracle database. When I enter Arabic text through the application, the database does not accept it; the Arabic characters are also defined. The operating system is Linux, with Oracle 10g and a JBoss server running a J2EE application. ...

Free program to grep unicode text files in Windows?

I have a collection of unicode text files (exported from regedit) and I'd like to pull out all the lines with a certain text on them. I've tried Grep for Windows and findstr but both can't seem to handle the unicode encoding. My results are empty, but when I use the -v option (show non-matching lines), the output shows a NUL between ea...

C# read unicode?

Hi. I am receiving a Unicode message via the network, which looks like: 74 00 65 00 73 00 74 00 3F 00 I am using a BinaryReader to read the stream from my socket, but the problem is that it doesn't offer a "ReadWideString" function, or something similar to it. Does anyone have an idea how to deal with this? Thanks! ...
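The byte sequence quoted above has the shape of UTF-16 little-endian text: each ASCII character is followed by a zero byte. As an illustration only, written in Python rather than the C# the question asks about, decoding those exact bytes looks like this:

```python
# The ten bytes quoted in the question, parsed from their hex form.
raw = bytes.fromhex("74 00 65 00 73 00 74 00 3F 00")

# Two bytes per character, low byte first: UTF-16 little-endian.
message = raw.decode("utf-16-le")
```

This assumes the whole message has already been read from the socket; how the message length is framed on the wire is not stated in the excerpt.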

Tell Visual Studio to create a UTF-8 environment when running executables

I am using CMake with CTest to run my regressions. My application is a console app which outputs in whatever encoding it is presented by its environment (a feature of Tcl). How do I tell Visual Studio to run my application in a UTF-8 environment? Right now my regression results are encoded in Latin, and it makes ...

Storing and displaying unicode string (हिन्दी) using PHP and MySQL

I have to store Hindi text in a MySQL database, fetch it using a PHP script and display it on a webpage. I did the following: I created a database and set its encoding to UTF-8 and also the collation to utf8_bin. I added a varchar field in the table and set it to accept UTF-8 text in the charset property. Then I set about adding data ...

Japanese characters in a latex \section{} cause an error.

I am working on getting Japanese documents created with latex. I have installed the latest version of texlive-2008 which includes CJK. In my document I have the following: \documentclass{class} \usepackage{CJK} \begin{document} \begin{CJK*}{UTF8}{min} \title{[Japanese Characters here 1]} \maketitle \section{[Japanese Characters here 2]...

Why does anyone use an encoding other than UTF-8?

I want to know why any developer would need to use an encoding other than UTF-8. ...

delphi 2009 unicode + ansi problem

Hello folks, I'm porting an ISAPI (pageproducers) application from Delphi 7 to Delphi 2009. The pages are based on HTML files in UTF-8. Everything goes well except when Onhtmltag is fired and I replace a transparent tag with any value containing special characters, such as accented characters (áé...). Those characters are replaced in the output wi...

Reading a UTF-8 Unicode file through non-unicode code.

I have to read a text file which is Unicode with UTF-8 encoding and have to write this data to another text file. The file has tab-separated data in lines. My reading code is C++ code without unicode support. What I am doing is reading the file line-by-line in a string/char* and putting that string as-is to the destination file. I can'...

How to print Unicode characters as hexadecimal codes in C++

I am reading a string of data from an Oracle database, which may or may not contain Unicode characters, into a C++ program. Is there any way of checking whether the string extracted from the database contains Unicode characters (UTF-8)? If any Unicode characters are present, they should be converted into hexadecimal format and need to displ...

Convert Unicode to String in Python (containing extra symbols)

How do you convert a Unicode string (containing extra characters like £, $, etc.) into a Python string? ...

Decoding HTML entities with Python

I'm trying to decode HTML entities from NYTimes.com and I cannot figure out what I am doing wrong. Take for example: "U.S. Adviser’s Blunt Memo on Iraq: Time ‘to Go Home’" I've tried BeautifulSoup, decode('iso-8859-1'), and django.utils.encoding's smart_str without any success. ...
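A minimal sketch using the standard library's `html.unescape`, assuming Python 3.4 or later. The input string is an assumption: the excerpt shows the headline already rendered, so the numeric character references below are a guess at what the raw page contained.

```python
import html

# Assumed raw form of the headline, with curly quotes as numeric references.
encoded = "U.S. Adviser&#8217;s Blunt Memo on Iraq: Time &#8216;to Go Home&#8217;"

# unescape() converts named and numeric character references to text.
decoded = html.unescape(encoded)
```

Note that this operates on text, not bytes; if the page was fetched as bytes, it has to be decoded with the page's declared charset first.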

XSLT Transform of Unicode source

In my application I am using the 4Suite.org XSLT library to perform transformations of source XML. The syntax is like this: from Ft.Xml.Xslt import Transform transformed_xml = Transform(raw_xml, stylesheet) where raw_xml and stylesheet have been defined elsewhere in my application. raw_xml will be the xml resulting from reading a fi...

.NET non-ASCII usernames and passwords

Hi, in my web site (C# & SQL Server) I am trying to enable non-ASCII usernames and passwords (the username and password columns are set to NVARCHAR). What would be the best approach to achieve this? ...

Dropping the Unicode markers in Html output

I have a python list which holds a few email ids accepted as unicode strings: [u'[email protected]',u'[email protected]',u'[email protected]'] This is assigned to values['Emails'] and values is passed to render as html. The Html renders as this: Emails: [u'[email protected]',u'[email protected]',u'[email protected]'] I would like it to ...
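The `u'...'` markers appear because the list itself is being rendered, which shows the representation of each element. One way around it, sketched here with hypothetical placeholder addresses (the real ones are obscured in the excerpt), is to join the items into a single string before handing the value to the template:

```python
# Hypothetical placeholder addresses standing in for the obscured ones.
emails = ["first@example.com", "second@example.com", "third@example.com"]

# Joining yields plain text, so no list brackets or quote markers are rendered.
values = {"Emails": ", ".join(emails)}
```

If the template engine supports it, joining inside the template achieves the same result while keeping the list intact in the view code.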