I have the following code:
import string
def translate_non_alphanumerics(to_translate, translate_to='_'):
    not_letters_or_digits = u'!"#%\'()*+,-./:;<=>?@[\]^_`{|}~'
    translate_table = string.maketrans(not_letters_or_digits,
                                       translate_to
                                       * len(not_letters_or_digits))
    return to_translate.translate(translate_table)
This works great for non-Unicode strings:
>>> translate_non_alphanumerics('<foo>!')
'_foo__'
But it fails for Unicode strings:
>>> translate_non_alphanumerics(u'<foo>!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in translate_non_alphanumerics
TypeError: character mapping must return integer, None or unicode
I can't make any sense of the paragraph on "Unicode objects" in the Python 2.6.2 docs for the str.translate() method.
How do I make this work for Unicode strings?
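For reference, the error message suggests that the Unicode version of translate() wants a mapping from code points (integers) to integers, Unicode strings, or None, rather than the 256-character table that string.maketrans() builds. The following is only a minimal sketch under that reading, not a confirmed fix: it builds the table as a dict keyed by ord() of each character. The dict-based table and the doubled backslash in the character list are assumptions added for illustration.

```python
def translate_non_alphanumerics(to_translate, translate_to=u'_'):
    # Same punctuation set as in the question; the backslash is doubled
    # here so the literal does not rely on an unrecognized escape.
    not_letters_or_digits = u'!"#%\'()*+,-./:;<=>?@[\\]^_`{|}~'
    # Assumption: a Unicode translate() accepts a dict that maps each
    # code point (an int) to its replacement string.
    translate_table = dict((ord(char), translate_to)
                           for char in not_letters_or_digits)
    return to_translate.translate(translate_table)


print(translate_non_alphanumerics(u'<foo>!'))
```

Note that this sketch only covers Unicode input. In Python 2, a byte string's translate() still expects the 256-character table from string.maketrans(), so handling both types would need a check on the type of to_translate.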