How do I convert Æ and á into regular English characters with Java? What I have is something like this: Local TV from Paraná. How do I convert it to [Parana]?
views: 290
answers: 2
A:
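As far as I know, there's no way to do this automatically; you'd have to substitute each character manually using String.replace.
String str = "Paraná";
// replace() treats its arguments as literal text, so no regex is involved
str = str.replace("á", "a");
// Æ is a ligature of A and E, so "AE" is the usual ASCII stand-in
str = str.replace("Æ", "AE");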
Kaleb Brasee
2009-12-26 17:58:08
+2
A:
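Look at ICU4J or the java.text.Normalizer class added in JDK 1.6:
import java.text.Normalizer;

public String removeAccents(String text) {
    // NFD splits each accented letter into a base letter plus combining marks,
    // and the regex then removes the combining marks.
    return Normalizer.normalize(text, Normalizer.Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}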
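One caveat for the original question: Æ has no Unicode decomposition, so NFD leaves it untouched and it still needs an explicit replacement. Below is a minimal sketch of the combined approach. The class name AccentStripper, the method name toAscii, and the choice of "AE"/"ae" as replacements are assumptions for illustration, not part of any library. Non-ASCII characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.text.Normalizer;

// Hypothetical helper class; the names and the AE/ae mapping are illustrative choices.
class AccentStripper {
    static String toAscii(String text) {
        // Step 1: NFD splits a-acute (U+00E1) into "a" plus a combining acute accent (U+0301).
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Step 2: remove the combining marks left behind by the decomposition.
        String stripped = decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        // Step 3: the AE ligature letters do not decompose, so map them by hand.
        return stripped.replace("\u00C6", "AE").replace("\u00E6", "ae");
    }

    public static void main(String[] args) {
        System.out.println(toAscii("Local TV from Paran\u00E1")); // prints: Local TV from Parana
        System.out.println(toAscii("\u00C6sop"));                  // prints: AEsop
    }
}
```

Other letters without a decomposition (for example ø or ß) would need their own entries in step 3.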
bmargulies
2009-12-26 18:11:35
You probably meant "Normalizer.normalize(text, Normalizer.Form.NFD)" instead of "Normalizer.decompose(text, false, 0)"
Steve Emmerson
2009-12-26 18:58:41
I think I accidentally put in the old sun.* class scheme instead. Thanks for catching it.
bmargulies
2009-12-26 19:42:32
Normalizer.Form.NFKD may be better than Normalizer.Form.NFD for his purposes, depending on how he wants to treat ligatures. E.g., NFKD will transform the single-character ligature `"ﬁ"` (U+FB01) into the two letters `"fi"`, while NFD leaves it unchanged.
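To see the difference between the two forms, here is a small sketch; the class name LigatureDemo and its helper methods are made up for this example. The ligature is written as the escape \uFB01 so the source file encoding does not matter.

```java
import java.text.Normalizer;

// Hypothetical demo class comparing canonical (NFD) and compatibility (NFKD) decomposition.
class LigatureDemo {
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    static String nfkd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    public static void main(String[] args) {
        String ligature = "\uFB01"; // LATIN SMALL LIGATURE FI, a single character
        System.out.println(nfd(ligature).length());  // prints: 1 (ligature kept as-is)
        System.out.println(nfkd(ligature));          // prints: fi (two separate letters)
    }
}
```

Neither form decomposes Æ, so that character still needs a manual replacement.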
Laurence Gonsalves
2009-12-26 21:34:59