I am using Spring's file upload to upload files. When I upload a file with an Arabic name and read the original file name in the controller, I get something like:

&#1575;&#1604;&#1605;&#1594;&#1601;&#1604;&#1610;&#1606;.png

I expect it to be:

المغفلين.png

Any ideas why this problem occurs?

A: 

It's likely Spring that has transformed the Unicode characters (at least, the non-ISO-8859-1 characters) into XML entities. This behaviour must be configurable somewhere in the Spring settings (or in those of the web MVC framework you're actually using in combination with Spring but didn't mention). Since I don't use Spring, I can't go into detail about configuring this.

But if you can't figure it out, you may consider using Apache Commons Lang's StringEscapeUtils#unescapeXml() to manually unescape the XML entities into real Arabic characters.

String realFilename = StringEscapeUtils.unescapeXml(escapedFilename);
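
To make the unescaping step concrete, here is a minimal dependency-free sketch of what such a call does for numeric character references. The class name NumericReferenceDecoder and the sample input are illustrative assumptions (the input is built from the code points of the expected file name); this is not Spring or Commons Lang code, and it only handles the decimal and hexadecimal numeric forms, not named entities such as &amp;amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not a Spring or Commons Lang API.
class NumericReferenceDecoder {

    // Matches decimal (&#1575;) and hexadecimal (&#x627;) numeric character references.
    private static final Pattern REFERENCE =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    static String decode(String input) {
        Matcher matcher = REFERENCE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            // Copy the literal text between the previous match and this one.
            out.append(input, last, matcher.start());
            String hex = matcher.group(1);
            try {
                int codePoint = (hex != null)
                        ? Integer.parseInt(hex, 16)
                        : Integer.parseInt(matcher.group(2));
                out.appendCodePoint(codePoint);
            } catch (IllegalArgumentException e) {
                // Number too large or not a valid code point: keep the text unchanged.
                out.append(matcher.group());
            }
            last = matcher.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed input: the expected file name written as decimal references.
        String escaped = "&#1575;&#1604;&#1605;&#1594;&#1601;&#1604;&#1610;&#1606;.png";
        System.out.println(decode(escaped));
    }
}
```

In a real project the library call above is the simpler choice; the sketch is only meant to show that the escaped and unescaped names carry the same code points.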
BalusC
Well, the problem is solved now: I can get the name correctly and insert it into the database. But another problem comes up when I send that Arabic name from the database to the form: it appears as something like ???????????
sword101
You're welcome. About the new problem, I see that you've already posted another question about this. I've answered it. Please finalize this question by voting up the helpful answers and accepting the answer that helped most in solving the problem. Also see http://stackoverflow.com/faq to learn how to use Stack Overflow properly and to keep its spirit :)
BalusC
A: 

There is nothing wrong with that encoding. It means exactly the same as the name you gave it.

According to the XML standard, character references take the form &#n;, where n is a decimal ([0-9]+) or hexadecimal (x[0-9a-fA-F]+) number referring to the Unicode code point of the character represented. Thus the file name in your question is valid XML.

In your case the first reference, &#1575; (equivalent to &#x0627;), represents the Unicode character with decimal code point 1575, usually written in hexadecimal as U+0627. This code point is described as the Arabic letter "alef".
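
The decimal/hexadecimal equivalence can be checked with a few lines of Java. This is only a sketch; the class and method names are made up for illustration, and Character.getName requires Java 7 or later.

```java
// Hypothetical demo class; the names are illustrative only.
class AlefCodePoint {

    // Formats a code point as "U+XXXX" followed by its Unicode character name.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    public static void main(String[] args) {
        int fromDecimal = Integer.parseInt("1575");  // as in &#1575;
        int fromHex = Integer.parseInt("0627", 16);  // as in &#x0627;
        System.out.println(fromDecimal == fromHex);  // both denote the same code point
        System.out.println(describe(fromDecimal));
    }
}
```

Here describe(1575) yields "U+0627 ARABIC LETTER ALEF".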

The symbols are encoded from left to right even though the symbols being encoded are Arabic (right-to-left), so the "alef" is at the left end of the ASCII-encoded file name. It is up to the rendering engine (whatever that might be) to render the string as RTL.

My Java experience is very limited, so unfortunately I cannot point you to a built-in or Spring feature that will help you handle this, but it seems that your XML is not being properly decoded (if I had to guess).
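
A small sketch of that point: a Java String keeps the characters in logical (typing) order, so the "alef" sits at index 0, and directionality is a property of each character that a renderer consults. The class and helper names below are assumptions for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class LogicalOrderDemo {

    // True if the character's Unicode bidirectional class is right-to-left.
    static boolean isRightToLeft(int codePoint) {
        byte direction = Character.getDirectionality(codePoint);
        return direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                || direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    public static void main(String[] args) {
        // The file name from the question, stored in logical order.
        String name = "\u0627\u0644\u0645\u063A\u0641\u0644\u064A\u0646.png";
        // Print each code point with its direction; storage order never changes.
        name.codePoints().forEach(cp ->
                System.out.printf("U+%04X rtl=%b%n", cp, isRightToLeft(cp)));
    }
}
```

The Arabic letters report as right-to-left while "p", "n" and "g" do not; the visual reordering happens only at display time.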

Walter