I have a Unicode string in Python, and I would like to remove all the accents (diacritics).
I found on the Web an elegant way to do this in Java:
- convert the Unicode string to its long normalized form (with separate characters for letters and diacritics)
- remove all the characters whose Unicode type is "diacritic".
Do I need to install a library such as pyICU, or is this possible with just the Python standard library? And what about in Python 3.0?
Important note: I would like to avoid code with an explicit mapping from accented characters to their non-accented counterparts.
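To make the two steps concrete, here is a rough sketch of what I have in mind, written for Python 3. It assumes the standard-library `unicodedata` module is enough for this, and it treats the general category "Mn" (nonspacing mark) as the "diacritic" type, which is my own guess at the closest equivalent. The name `strip_accents` is made up for illustration.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining diacritical marks removed (sketch only)."""
    # Step 1: decompose each character into a base letter followed by
    # separate combining marks (canonical decomposition, NFD).
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: keep every character except nonspacing marks (category "Mn"),
    # which is the assumed stand-in for "diacritic" here.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_accents("Café crème brûlée"))  # prints "Cafe creme brulee"
```

Note that this uses no explicit mapping table, but letters that have no decomposition (for example "ø") would pass through unchanged.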
Thanks for your help.