I'm a Python beginner, and I have a UTF-8 problem.
I have a UTF-8 string and I would like to replace all German umlauts with their ASCII replacements (in German, the u-umlaut 'ü' may be rewritten as 'ue').
The u-umlaut has Unicode code point 252, so I tried this:
>>> str = unichr(252) + 'ber'
>>> print repr(str)
u'\xfcber'
>>> print repr(str).replace(unichr(252), 'ue')
u'\xfcber'
I expected the last string to be u'ueber'.
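To make the expected behavior concrete, here is a minimal sketch of the result I'm after. The names `UMLAUT_MAP` and `replace_umlauts` are made up for illustration, and the 'ae'/'oe'/'ue' spellings are the usual German transliterations. Note that this sketch calls `replace` on the unicode string itself rather than on the output of `repr()`, which is an assumption about where my attempt goes wrong.

```python
# Hypothetical mapping from German umlauts to their ASCII spellings.
UMLAUT_MAP = {
    u'\xe4': u'ae',  # a-umlaut
    u'\xf6': u'oe',  # o-umlaut
    u'\xfc': u'ue',  # u-umlaut (code point 252)
    u'\xc4': u'Ae',  # A-umlaut
    u'\xd6': u'Oe',  # O-umlaut
    u'\xdc': u'Ue',  # U-umlaut
}

def replace_umlauts(text):
    """Return a copy of the unicode string with each umlaut spelled out in ASCII."""
    for umlaut, ascii_form in UMLAUT_MAP.items():
        # Operate on the string itself, not on its repr().
        text = text.replace(umlaut, ascii_form)
    return text
```

With this, `replace_umlauts(u'\xfcber')` should give `u'ueber'`, which is the output I was hoping for above.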
What I ultimately want to do is replace all u-umlauts in a file with 'ue':
import sys
import codecs
f = codecs.open(sys.argv[1],encoding='utf-8')
for line in f:
    print repr(line).replace(unichr(252), 'ue')
Thanks for your help! (I'm using Python 2.3.)