UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 38: ordinal not in range(128)

I am downloading a web page and then printing its contents, which should be simple. Do I have to encode it somehow?

A: 

Make sure you have PuTTY configured to accept UTF-8 encoded data.

bastianneu
Hi, it is currently in UTF-8, but still does not work.
TIMEX
+1  A: 

Try UTF-8 for a start. The website you download might use a different charset than ASCII, and those extra characters cannot be printed on the console.

So wherever you do print text, do print text.encode('utf-8') instead.

Abgan
A: 

Printing to standard output can be problematic, because Python often doesn't know what character encoding the system is using. When that happens, Python 2 assumes the most conservative choice, US-ASCII. So when you try to print a string that contains characters that aren't in ASCII, like the U+2019 smart quote (’), it gives you this error.
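To see the failure in isolation, here is a minimal sketch that encodes a string to ASCII by hand. It is written for Python 3 (the question uses Python 2, where the message shows u'\u2019' instead), and the sample string is made up for illustration.

```python
# Python 3 sketch: reproduce the error by encoding to ASCII explicitly.
# The sample text is an assumption, not the asker's actual page content.
text = "It\u2019s a test"  # contains U+2019 RIGHT SINGLE QUOTATION MARK

try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    failed_codec = exc.encoding        # the codec that rejected the text
    bad_char = exc.object[exc.start]   # the first character it could not encode
    # ascii() keeps this message printable even on an ASCII-only console
    print("cannot encode", ascii(bad_char), "with", failed_codec)

# UTF-8 can represent every Unicode character, so this does not raise
utf8_bytes = text.encode("utf-8")
```

The same exception is what print raises when the output encoding is ASCII and the text contains a character outside that range.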

In Python 3, sys.stdout.encoding is normally taken from the locale, which is UTF-8 on most modern systems. If you are sure that your standard output (i.e. PuTTY in your case) accepts UTF-8, then yes, you can encode it explicitly:

print content.encode('UTF-8')
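If you are not sure what the terminal accepts, one option is to encode with whatever sys.stdout reports and replace anything that does not fit, rather than crash. The following is a Python 3 sketch of that idea; encode_for_output is a hypothetical helper name, and falling back to UTF-8 when no encoding is reported is an assumption that may not suit every setup.

```python
import sys

def encode_for_output(text, encoding=None):
    """Hypothetical helper: encode text, replacing unencodable characters with '?'."""
    if encoding is None:
        # sys.stdout.encoding may be None or missing when stdout is redirected
        # or replaced; falling back to UTF-8 here is an assumption.
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors="replace")

content = "It\u2019s a test"
ascii_bytes = encode_for_output(content, "ascii")   # lossy: the quote becomes '?'
utf8_bytes = encode_for_output(content, "utf-8")    # lossless
```

In Python 3 the resulting bytes can be written to a real terminal with sys.stdout.buffer.write(...). The trade-off is that errors="replace" silently loses characters, so it suits log-style output better than data you need to keep intact.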
bobince