Your definition of "Unicode characters" is a bit vague. Beginners usually use this term to mean all characters that are NOT covered by the standard ISO 8859 charset. Is that true in your case? If so, then you likely need to loop through every character of the String and test whether its code point is covered by the ISO 8859 charset.
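As a rough sketch of that loop, assuming "ISO 8859" means ISO 8859-1 (Latin-1) and that you simply want to substitute a placeholder for anything outside it: the class name `Latin1Check` and the method `replaceNonLatin1` are made up for illustration, and the charset can be swapped for another one if you mean a different part of ISO 8859.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class Latin1Check {

    /**
     * Hypothetical helper: returns a copy of the input in which every code point
     * that ISO 8859-1 cannot encode is replaced by the given character.
     */
    static String replaceNonLatin1(String input, char replacement) {
        // Assumption: ISO 8859-1 is the charset meant by "ISO 8859".
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder builder = new StringBuilder(input.length());

        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            int length = Character.charCount(codePoint); // 2 for surrogate pairs

            // Ask the encoder whether this single code point is representable.
            if (encoder.canEncode(input.subSequence(i, i + length))) {
                builder.appendCodePoint(codePoint);
            } else {
                builder.append(replacement);
            }
            i += length;
        }
        return builder.toString();
    }
}
```

Iterating by code point rather than by `char` keeps a surrogate pair from being replaced twice.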
You can also just keep a Map of replacements and, in a loop, replace each character for which the map contains a key. For example:
import java.util.HashMap;
import java.util.Map;

Map<Character, Character> charReplacementMap = new HashMap<>();
charReplacementMap.put('Ü', 'Y');
// Put more here.

String originalString = "AÜAÜ";
StringBuilder builder = new StringBuilder();
for (char currentChar : originalString.toCharArray()) {
    // Use the mapped replacement if there is one, else keep the character.
    Character replacementChar = charReplacementMap.get(currentChar);
    builder.append(replacementChar != null ? replacementChar : currentChar);
}
String newString = builder.toString(); // "AYAY"
Or do you mean "all characters with diacritical marks"? If so, then you can use java.text.Normalizer to get rid of all diacritical marks:
import java.text.Normalizer;
import java.text.Normalizer.Form;

/**
 * Remove any diacritical marks (accents like ç, ñ, é, etc) from
 * the given string (so that it returns plain c, n, e, etc).
 * @param string The string to remove diacritical marks from.
 * @return The string with removed diacritical marks, if any.
 */
public static String removeDiacriticalMarks(String string) {
    // NFD splits each accented character into a base letter plus
    // combining marks; the regex then strips those marks.
    return Normalizer.normalize(string, Form.NFD)
        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}
One pitfall: Ü would become U, not Y. Not sure if that's what you're after. If you want to replace each character by how it is pronounced, you'll really need to create a mapping. Sure, it's tedious work, but it's done in less time than you needed to follow this topic ;)
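If you want both behaviours, one possible way to combine them is to apply your own mapping first and let Normalizer handle whatever is left. This is only a sketch: the class name `DiacriticsDemo`, the `OVERRIDES` map and the method `transliterate` are invented for the example, and the single Ü to Y entry just repeats the sample mapping from above.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

final class DiacriticsDemo {

    // Hypothetical override table; fill it with the pronunciations you need.
    private static final Map<Character, String> OVERRIDES = new HashMap<>();
    static {
        OVERRIDES.put('\u00DC', "Y"); // U with diaeresis -> Y, as in the sample mapping
    }

    static String transliterate(String input) {
        // Compose first, so that "U" + combining diaeresis also matches the map key.
        String composed = Normalizer.normalize(input, Normalizer.Form.NFC);

        StringBuilder builder = new StringBuilder(composed.length());
        for (char currentChar : composed.toCharArray()) {
            String override = OVERRIDES.get(currentChar);
            builder.append(override != null ? override : String.valueOf(currentChar));
        }

        // Everything not covered by the map just loses its diacritical marks.
        return Normalizer.normalize(builder, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
}
```

With this, characters in the map get your chosen replacement, while something like ç, ñ or é still falls back to plain c, n or e.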