unicode

Apache htdocs in folder with unicode name

I have my Apache (for Windows) htdocs in a folder like c:\anything1\怘怙怚怛\anything2. The problem is that in this case PHP won't execute any scripts from there and displays an error message like this: Warning: Unknown: failed to open stream: No such file or directory in Unknown on line 0 Fatal error: Unknown: Failed opening required ...

read unicode output of console application

I have a console app written in Delphi 2010. Its output supports Unicode (I used UTF8Encode and SetConsoleOutputCP(CP_UTF8) for this). When I run the program from the command prompt it works fine. Now I want to read the output from another program, which was created in Delphi 5. I use this method. But I have problems with Unicode characters....

How to check if the word is Japanese or English using PHP

I want to process English words and Japanese words differently in this function: function process_word($word) { if ($word is english) { ///////// } else if ($word is japanese) { //////// } } Thank you ...
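
One way to approach the check asked about above is to test each character against Unicode block ranges. The sketch below is in Python rather than PHP, and `classify_word` is a hypothetical helper name, not something from the question. It assumes that any Hiragana, Katakana, or CJK ideograph marks a word as Japanese; since kanji are shared with Chinese, this is only a heuristic.

```python
def classify_word(word):
    """Return 'japanese', 'english', or 'other' for a single word (heuristic)."""

    def is_japanese(ch):
        cp = ord(ch)
        return (
            0x3040 <= cp <= 0x309F      # Hiragana block
            or 0x30A0 <= cp <= 0x30FF   # Katakana block
            or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs (kanji, shared with Chinese)
        )

    # Any Japanese-script character is enough to call the word Japanese.
    if any(is_japanese(ch) for ch in word):
        return "japanese"
    # Otherwise require a non-empty word made only of ASCII letters.
    if word and all(ch.isascii() and ch.isalpha() for ch in word):
        return "english"
    return "other"


for sample in ["hello", "こんにちは", "東京", "123"]:
    print(sample, classify_word(sample))
```

The same range test could be ported to other languages; the ranges are the part that carries over, not the helper itself.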

stdout and stderr character encoding

I am working on a C++ string library that has four main classes that deal with ASCII, UTF-8, UTF-16, and UTF-32 strings. Every class has a Print function that formats an input string and prints the result to stdout or stderr. My problem is that I don't know what the default character encoding is for those streams. For now my classes work on Windows, later ...

feedparser fails during script run, but can't reproduce in interactive python console

It's failing with this when I run Eclipse or when I run my script in IPython: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128). I don't know why, but when I simply execute the feedparser.parse(url) statement using the same URL, no error is thrown. This is stumping me big time. The code is as simple ...

When uploading Arabic files in Spring, filename ends up with XML entities instead of Arabic glyphs

I am using Spring upload to upload files. When uploading an Arabic file and getting the original file name in the controller, I get something like: المغفلين.png I expect it to be: المغفلين.png Any ideas why this problem occurs? ...

Converting datetime.ctime() values to Unicode

I would like to convert datetime.ctime() values to Unicode. Using Python 2.6.4 running under Windows I can set my locale to Spanish like below: >>> import locale >>> locale.setlocale(locale.LC_ALL, 'esp' ) Then I can pass %a, %A, %b, and %B to ctime() to get day and month names and abbreviations. >>> import datetime >>> dateValue ...

Problem with reading Arabic in a JSP page?

I have a column in the PostgreSQL database which contains Arabic data. When the data is read from the database in the controller, it is read fine and the encoding is good, but when the data is sent to the JSP page and read there, it appears as something like ?????????. Any ideas why something like this occurs? ...

How to copy a string to Clipboard with utf-16 encoding in WPF app

My understanding is that a string in .NET is UTF-16 by default. But when I copy my string to the Clipboard, it is changed to SJIS encoding (my OS is Japanese). Here is what I am doing: string myStringToCopy = "Some text Here"; System.Windows.DataFormat myDataFormat = DataFormats.GetDataFormat("MyFormat-V1"); Clipboard.SetData(myDataFormat...

Using Python, How to copy files in 'temporary internet files' folder in Windows

I am using this code to find files recursively in a folder , with size greater than 50000 bytes. def listall(parent): lis=[] for root, dirs, files in os.walk(parent): for name in files: if os.path.getsize(os.path.join(root,name))>500000: lis.append(os.path...

Hunspell pack for Unicode

I want to compile a hunspell dictionary for the Hindi language. I found the instructions on the following page but could not follow them. http://manpages.ubuntu.com/manpages/dapper/man4/hunspell.4.html Can someone guide me or show me how to write a test hunspell pack for an Indian language? ...

What's the deal with char.GetNumericValue?

I was working on Project Euler 40, and was a bit bothered that there was no int.Parse(char). Not a big deal, but I did some asking around and someone suggested char.GetNumericValue. GetNumericValue seems like a very odd method to me: Takes in a char as a parameter and returns...a double? Returns -1.0 if the char is not '0' through '9...

Convert UTF-16 to UTF-8 under Windows and Linux, in C

I was wondering if there is a recommended cross-platform (Windows and Linux) method for converting strings from UTF-16LE to UTF-8, or should one use different methods for each environment? I've managed to google a few references to 'iconv', but for some reason I can't find samples of basic conversions, such as - converting a wchar_t ...
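
The conversion described above is, at its core, a decode from UTF-16LE followed by an encode to UTF-8. The sketch below shows that in Python rather than C, only to illustrate the byte-level result a C implementation (iconv-based or otherwise) could be checked against. `utf16le_to_utf8` is a hypothetical helper name, and the sample string is made up for illustration.

```python
def utf16le_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16LE bytes (no BOM expected) and re-encode them as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# Illustrative input: one accented letter plus one character outside the BMP,
# which UTF-16 stores as a surrogate pair (4 bytes).
sample = "h\u00e9llo \U0001F600".encode("utf-16-le")
converted = utf16le_to_utf8(sample)

print(len(sample), "UTF-16LE bytes ->", len(converted), "UTF-8 bytes")
print(converted.decode("utf-8"))
```

Including a character outside the BMP in any test input is worthwhile, because a conversion that treats UTF-16 as fixed two-byte units breaks exactly there.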

Why does Python sometimes upgrade a string to unicode and sometimes not?

I'm confused. Consider this code working the way I expect: >>> foo = u'Émilie and Juañ are turncoats.' >>> bar = "foo is %s" % foo >>> bar u'foo is \xc3\x89milie and Jua\xc3\xb1 are turncoats.' And this code not at all working the way I expect: >>> try: ... raise Exception(foo) ... except Exception as e: ... foo2 = e ... >>...

SQL Server 2005 Convert Ascii to Unicode (UTF-8 -> nvarchar)

I have data in an nvarchar field that looks like this: "Zard Frères Guesta" How do I convert it to a readable (Unicode) format in T-SQL? ...

Java, JavaCC: How to parse characters outside the BMP?

Hello, everyone! I am referring to the XML 1.1 spec. Look at the definition of NameStartChar: NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xE...
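
Background for the question above: a Java char is a single UTF-16 code unit, so any code point from #x10000 upward is stored as a pair of surrogate code units. The arithmetic is sketched below in Python rather than Java, with `to_surrogate_pair` as a hypothetical helper name; it illustrates the encoding only and says nothing about how JavaCC itself handles such input.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary-plane code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits of the offset
    return high, low


for cp in (0x10000, 0x1F600):
    high, low = to_surrogate_pair(cp)
    print(f"U+{cp:X} -> {high:04X} {low:04X}")
```

A grammar that matches one 16-bit unit at a time would therefore need to express such a range as a high-surrogate range followed by a low-surrogate range.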

Paste from Word + Create XML document -> hexadecimal value 0x0C, is an invalid character (.Net)

I have a webpage that accepts HTML-input from users. The input is converted into an xml document using the System.Xml namespace, like this: var doc = new XmlDocument(); doc.AppendChild(doc.CreateElement("root")); doc.DocumentElement.SetAttribute("BodyHTML", theTextBox.Text); Afterwards an Xsl transformation (System.Xml.Xsl.XslCompiled...

Unicode Cookie Value

I am about to create a cookie with a Unicode value (Japanese characters). Is there any problem with Unicode cookie values in IE 7, IE 8, Firefox, Safari, or Chrome? Thank you ...

PHP-GD: Dealing with Unicode characters

I am developing a web service that renders characters using the PHP GD extension, using a user-selected TTF font. This works fine in ASCII-land, but there are a few problems: The string to be rendered comes in as UTF-8. I would like to limit the list of user-selectable fonts to be only those which can render the string properly, as so...

Other language string in SQL Server 2005

I am trying to insert some strings which are not in English (another language). When I fetch them back they are not correct. They come back like "?????". But when I enter the same string through the SQL Server UI (SSMS), it works OK. What could be the solution, please? ...