I tried to do that, and I found this errors:
>>> import re
>>> x = 'Ingl\xeas'
>>> x
'Ingl\xeas'
>>> print x
Ingl�s
>>> x.decode('utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 4-5: unexpected end of data
>>> x.decode('utf8', 'ignore')
u'Ingl'
>>> x.decode('utf8', 'replace')
u'Ingl\ufffd'
>>> print x.decode('utf8', 'replace')
Ingl�
>>> print x.decode('utf8', 'xmlcharrefreplace')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
TypeError: don't know how to handle UnicodeDecodeError in error callback
When I use the print statement, I want that:
>>> print x
u'Inglês'
Any help is welcome.