First of all, you decode data to Unicode (the absence of encoding) when reading from a file, pipe, socket, terminal, etc., and you encode Unicode to an appropriate byte encoding when sending or persisting data. I suspect mixing up these two steps is the root of your problem.
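The decode-on-input, encode-on-output rule can be sketched roughly as follows. This is only an illustration, not your code: the byte string stands in for data read from a file or socket, and using UTF-8 on both sides is an assumption. The b'' and u'' literals let it run on both Python 2 and Python 3.

```python
# Stand-in for bytes read from a file, pipe, or socket (assumed to be UTF-8).
incoming = b'\xc2\xbd cup'

# 1. Decode at the boundary: bytes -> Unicode.
text = incoming.decode('utf-8')

# 2. Work with Unicode inside the program.
shouted = text.upper()

# 3. Encode at the boundary: Unicode -> bytes, just before writing or sending.
output_bytes = shouted.encode('utf-8')
```

Note that the decoded text is five characters long while the incoming data is six bytes, because the 1/2 symbol takes two bytes in UTF-8.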
The web service should declare the encoding in the headers or data received. print normally automatically encodes Unicode to the terminal's encoding (discovered through sys.stdout.encoding) or in absence of that just ascii. If the characters in your data are not supported by the target encoding, you'll get a UnicodeEncodeError.
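If the service declares its encoding in a Content-Type header, something like the sketch below could pick it out and use it for decoding. The helper name charset_from_content_type, the sample header and body, and the UTF-8 fallback are all hypothetical; the standard library's email.message.Message is used only as a convenient header parser.

```python
from email.message import Message


def charset_from_content_type(header_value, default='utf-8'):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` (an assumption) when no charset is declared.
    """
    msg = Message()
    msg['Content-Type'] = header_value
    return msg.get_content_charset(default)


# Hypothetical response pieces; a real client would get these from the service.
content_type = 'text/plain; charset=ISO-8859-1'
body = b'\xbd cup'  # the 1/2 symbol is the single byte 0xBD in ISO-8859-1

encoding = charset_from_content_type(content_type)
text = body.decode(encoding)  # bytes -> Unicode, using the declared charset
```

The point is that the encoding comes from what the service says, not from a guess; the same bytes decoded as UTF-8 would fail.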
Since that is not the error you received, you should post some code so we can see what you are doing. Most likely, you are encoding a byte string instead of decoding it. Here's an example of this:
>>> data = '\xc2\xbd' # UTF-8 encoded 1/2 symbol.
>>> data.encode('cp437')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\dev\python\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)
What I did here is call encode on a byte string. Since encode requires a Unicode string, Python used the default ascii encoding to decode the byte string to Unicode first, before encoding to cp437.
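To make the hidden step visible, here is a sketch that spells out the two stages by hand: an explicit decode with ascii followed by the requested encode. It is meant as an approximation of what Python 2 does implicitly, not its exact internals. It is written to run on Python 3 as well, where calling encode directly on a byte string instead raises AttributeError because bytes has no encode method.

```python
data = b'\xc2\xbd'  # UTF-8 encoded 1/2 symbol, as in the example above

try:
    # Stage 1: the decode Python 2 performs behind the scenes,
    # using the default 'ascii' codec.
    as_unicode = data.decode('ascii')
    # Stage 2: the encode that was actually requested.
    result = as_unicode.encode('cp437')
except UnicodeDecodeError as exc:
    # Stage 1 fails on byte 0xc2, so stage 2 is never reached.
    result = None
    error_message = str(exc)
```

That is why an encode call can end in a UnicodeDecodeError: the failure comes from the decode nobody asked for.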
Fix this by decoding the data instead of encoding it; print will then encode to stdout automatically. As long as your terminal supports the characters in the data, it will display properly:
>>> import sys
>>> sys.stdout.encoding
'cp437'
>>> print data.decode('utf8') # implicit encode to sys.stdout.encoding
½
>>> print data.decode('utf8').encode('cp437') # explicit encode.
½
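If the terminal might not support every character, one option is to encode explicitly with an error handler rather than rely on the strict default. The helper encode_for_terminal below is a hypothetical name, and hard-coding cp437 is an assumption that mirrors the session above; in practice the encoding would come from sys.stdout.encoding.

```python
def encode_for_terminal(text, encoding, errors='replace'):
    """Encode Unicode text, substituting '?' for unsupported characters."""
    return text.encode(encoding, errors)


half = b'\xc2\xbd'.decode('utf8')      # U+00BD, present in cp437
euro = b'\xe2\x82\xac'.decode('utf8')  # U+20AC, absent from cp437

half_bytes = encode_for_terminal(half, 'cp437')
euro_bytes = encode_for_terminal(euro, 'cp437')

try:
    euro.encode('cp437')  # default strict error handling
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With 'replace', the unsupported character is degraded to a question mark instead of raising UnicodeEncodeError, which may or may not be acceptable depending on the data.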