ansaurus

Question

Answer 1

+2 A:

The str(para) call implicitly encodes the unicode in para with the default (ascii) codec. That happens before the encode() call ever runs, so the error is raised by str(), not by encode():

>>> s=u'123\u2019'
>>> str(s)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 3: ordinal not in range(128)
>>> s.encode("utf-8")
'123\xe2\x80\x99'
>>>

Try encoding para directly instead of passing it through str() first, maybe by applying encode("utf-8") to each list element.
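A minimal sketch of that suggestion, assuming para is a list of unicode strings. The sample values and the helper name encode_items are made up for illustration, since the question's actual code is not shown here:

```python
# Hypothetical stand-in for `para`: a list of unicode strings.
para = [u'123\u2019', u'plain text']

def encode_items(items, encoding="utf-8"):
    """Encode each unicode element to bytes, without going through str()."""
    return [item.encode(encoding) for item in items]

encoded = encode_items(para)
```

Each element is encoded on its own, so the ascii codec is never involved.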

gimel 2010-04-13 05:13:11

Beautiful Soup Unicode encode error
