ansaurus

Question

UnicodeEncodeError on MySQL insert in Python

Answer 1

+2 A:

You want text.encode('utf8').
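A minimal sketch of what that call does, assuming text is a Unicode string like the one discussed in this thread. The variable names are illustrative and no database call is shown; the point is only that encoding to ASCII fails on the ö while encoding to UTF-8 succeeds:

```python
# Illustrative value; the real text would come from the parsed page.
text = u'Waldenstr\xf6m'

# Encoding to ASCII fails because U+00F6 has no ASCII representation.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding to UTF-8 succeeds and yields a byte string (two bytes for the o-umlaut).
encoded = text.encode('utf8')
```

Whether the database driver wants the encoded bytes or the Unicode string depends on how the connection is configured, so treat this as a sketch of the encoding step only.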

Art Gillespie 2009-11-14 00:36:28

Yes, I tried this, but it gave the same error.

jack 2009-11-14 00:55:51

OK, it works now. Thanks, Art.

jack 2009-11-14 01:02:10

Answer 2

A:

>>> print text
u'Waldenstr\xf6m'

There is a difference between displaying something in the shell (which uses the repr) and printing it (which just spits out the string):

>>> u'Waldenstr\xf6m'
u'Waldenstr\xf6m'

>>> print u'Waldenstr\xf6m'
Waldenström

So, I'm not sure your snippet above is really what happened. If it definitely is, then your XHTML must contain exactly that string:

<div class="something">u'Waldenstr\xf6m'</div>

(Maybe it was incorrectly generated by Python using a string's repr() instead of its str()?)

If this is right and intentional, you would need to parse that Python string literal into a simple string. One way of doing that would be:

>>> r = r"u'Waldenstr\xf6m'"
>>> print r[2:-1].decode('unicode-escape')
Waldenström
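As an alternative to slicing off the quotes by hand, the standard library's ast.literal_eval can parse the whole literal, including the u prefix. This is a swapped-in technique rather than the approach shown above, and the variable names are illustrative:

```python
import ast

# The text as it would appear in the page: a Python literal, not the name itself.
raw = r"u'Waldenstr\xf6m'"

# literal_eval evaluates only literals (strings, numbers, tuples, ...),
# so it does not execute arbitrary code the way eval() would.
parsed = ast.literal_eval(raw)
```

This also copes with literals that use double quotes or other escapes, which fixed slicing with [2:-1] would not handle.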

If the snippet at the top is actually not quite right and you are simply asking why Python's repr escapes all non-ASCII characters, the answer is that printing non-ASCII text to the console is unreliable across environments, so the escape is safer. In the examples above, you might have received ?s or worse instead of the ö if you were unlucky.
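To see where the ?s come from, here is a small sketch that simulates an ASCII-only console by encoding explicitly. The choice of ASCII as the target is an assumption for illustration; a real console's encoding varies by environment:

```python
name = u'Waldenstr\xf6m'

# 'replace' substitutes '?' for every character the target encoding lacks.
lossy = name.encode('ascii', 'replace')

# 'backslashreplace' keeps the information as an escape sequence instead,
# which is essentially what the escaped repr does.
escaped = name.encode('ascii', 'backslashreplace')
```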

In Python 3 this changes, because repr no longer escapes printable non-ASCII characters:

>>> 'Waldenstr\xf6m'
'Waldenström'
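If you still want the escaped form in Python 3, the built-in ascii() gives it. A short sketch, Python 3 only, with illustrative variable names:

```python
name = 'Waldenstr\xf6m'

# repr() in Python 3 leaves printable non-ASCII characters as they are.
shown = repr(name)

# ascii() behaves like the Python 2 repr and escapes them.
escaped = ascii(name)
```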

bobince 2009-11-14 01:02:33
