I saved my script in UTF-8 encoding.

I changed my code page on Windows to 65001.

I'm on Python 2.6.

Script #1

# -*- coding: utf-8 -*-
print u'Español'
x = raw_input()

Script #2

# -*- coding: utf-8 -*-
a = 'Español'
a.encode('utf8')
print a
x = raw_input()

Script #1 prints the word fine with no errors; script #2 raises an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 4: ordinal not in range(128)

I want to be able to print a variable like the one in script #2 without errors. encode('utf8') was described to me as the equivalent of writing u'string'.

Obviously it's not, because it throws errors. How can I do it, folks?

+4  A: 

For script #2:

a = 'Español'           # In Python2 this is a string of bytes
a = a.decode('utf-8')   # This converts it to a unicode string
print(a)
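
As a rough sketch of why the original encode() call fails: in Python 2, calling encode() on a byte string first decodes it implicitly with the ASCII codec, and that hidden step is what raises the error. The snippet below performs that step explicitly so it behaves the same on Python 2.6+ and Python 3. The names raw, text, error_name and failing_position are only for illustration.

```python
# -*- coding: utf-8 -*-
# The UTF-8 bytes of 'Español', written with escapes so the sketch does not
# depend on how this file is saved.
raw = b'Espa\xc3\xb1ol'

# Python 2 has to turn a byte string into unicode before it can encode it,
# and it uses the ASCII codec for that hidden step. Doing the step explicitly:
try:
    raw.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__
    failing_position = exc.start   # index of the first non-ASCII byte

# Naming the real encoding succeeds and returns a new unicode object.
# `raw` itself is left unchanged, so the result has to be assigned.
text = raw.decode('utf-8')
```

Note also that script #2 discards the return value of a.encode('utf8'), so even if the call had succeeded, a would have stayed the same.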
unutbu
+7  A: 

Change your code to the following:

# -*- coding: utf-8 -*-
a = 'Español'
a = a.decode('utf8')
print a
x = raw_input()

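decode() tells Python which encoding to use when reading the bytes and returns the result as a unicode object. It does not modify the original, which is why the result is assigned back to a. Making the changes above should fix your problem.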

The problem is that Python 2 stores a plain string as a sequence of bytes and does not remember which encoding those bytes are in. What matters is how the bytes are interpreted, and that is what decode() and the u'' prefix specify.
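
To make the bytes-versus-text distinction concrete, here is a small sketch that should run on Python 2.6+ and Python 3. It assumes the bytes really are UTF-8, and the variable names are made up for the example.

```python
# -*- coding: utf-8 -*-
raw = b'Espa\xc3\xb1ol'        # 8 bytes: the n-tilde takes two bytes in UTF-8
text = raw.decode('utf-8')     # 7 characters of text

# Decoding the bytes gives the same value a u'' literal would.
same_as_literal = (text == u'Espa\xf1ol')

# encode() goes the other way, from text back to bytes.
round_trip = text.encode('utf-8')
```

Here len(raw) is 8 while len(text) is 7, which is one quick way to check whether a value holds bytes or decoded text.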

Stargazer712