Hello there,

Is there a simple, lightweight way to replace at least some non-ASCII characters with their closest ASCII equivalents? For example, this string

abc-åäö.txt

should be changed to

abc-aao.txt

A bit of background: zip tools do not reliably support UTF-8 file names, hence the need to downgrade. As far as I recall, Google's "download attachments as single zip file" feature replaces every non-ASCII character with the '_' character.

PS: The code may just as well be in some other language; if it's more or less understandable, I'll port it to Java.

PPS: This is my first question so far, so please don't downvote me into the ground, okay?

+1  A: 

Maybe this would do?

Krumelur
Thanks for the reference, but I don't see any actual code there. Apparently this is either already part of the JRE (java.text.Normalizer or something similar) or not a lightweight solution...
Anton S. Kraievoy
+2  A: 

Have a look at java.text.Normalizer. It can help you with transforming equivalent characters: http://en.wikipedia.org/wiki/Unicode_equivalence
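
For illustration, here is a minimal sketch of how java.text.Normalizer could be applied to the file name from the question. The class name AsciiFolder and the method toAscii are made up for this example, and the '_' fallback for characters that have no decomposition (such as 'ø' or 'ß') is an assumption borrowed from the behaviour described in the question, not something Normalizer does by itself.

```java
import java.text.Normalizer;

// Hypothetical helper; the class and method names are illustrative only.
final class AsciiFolder {
    private AsciiFolder() {}

    static String toAscii(String input) {
        // NFD splits a precomposed letter into its base letter plus combining
        // marks, e.g. U+00E5 becomes 'a' followed by U+030A.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches combining marks; removing them keeps the base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Assumption: anything still outside 7-bit ASCII is masked with '_'.
        return stripped.replaceAll("[^\\x00-\\x7F]", "_");
    }

    public static void main(String[] args) {
        // Unicode escapes spell the name from the question and avoid
        // depending on the source file encoding.
        System.out.println(toAscii("abc-\u00e5\u00e4\u00f6.txt")); // prints abc-aao.txt
    }
}
```

This only covers characters that Unicode defines as a base letter plus diacritic; letters that should expand to several ASCII characters (for example 'ß' to "ss") would need an explicit replacement table on top of it.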

relet
A: 

Okay, I found something that more or less works in this question: PHP: Replace umlauts with closest 7-bit ASCII equivalent in an UTF-8 string

Anton S. Kraievoy
+1  A: 

Looks like the problem is solved here:

[solution][howto] Convert special characters to normal chars (é to e) http://www.ramonfincken.com/permalink/topic192.html

d-live