ansaurus

Question

Answer 1

+4 A:

This error is probably happening when you print the representation of the BeautifulSoup object, which happens automatically if, as I suspect, you are working in the interactive console.

# This code will work fine. Note that we assign the BeautifulSoup object
# to a variable so the interactive console does not print it immediately.
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(u'\xa0')

# This will probably show the error you saw
print soup

# And this would probably be fine
print soup.encode('utf-8')
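
To see the encoding step in isolation, here is a minimal sketch that does not need BeautifulSoup at all. It assumes the console encoding is ASCII, which is my guess about your setup and not something I can confirm; try_encode is a hypothetical helper written only for this illustration.

```python
# Sketch: reproduce the encoding step that printing performs on a unicode string.
# Assumption: the console encoding is ASCII (not confirmed for the asker's setup).

text = u'\xa0'  # the same non-breaking space used in the snippet above


def try_encode(value, encoding):
    """Hypothetical helper: return the encoded bytes, or the error if encoding fails."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# U+00A0 is outside the ASCII range, so this yields a UnicodeEncodeError.
ascii_result = try_encode(text, 'ascii')

# UTF-8 can represent U+00A0, so this yields the two bytes 0xC2 0xA0.
utf8_result = try_encode(text, 'utf-8')
```

If that is what is going on, encoding explicitly before printing (as in the last line of the snippet above) sidesteps the implicit ASCII encoding.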

Triptych 2009-08-24 05:58:34

This is correct. The error was encountered when trying to debug and printing the content to the screen. It is unfortunate that UTF-8 issues make debugging such a challenge, but the code does work correctly as long as I do not print.

Ryan Rosario 2009-08-24 17:52:29

@Ryan - Trust me - I've been there. Glad this helped.

Triptych 2009-08-24 17:54:24


urlopen, BeautifulSoup and UTF-8 Issue
