This is in Python 2.4. Here is my situation: I pull a string from a database, and it contains an umlauted 'o' (\xf6). At this point type(value) returns str. When I then call value.decode('utf-8'), I get an error ('utf8' codec can't decode bytes in position 1-4).
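For reference, here is a minimal sketch of how the two byte sequences in question behave when decoded. It assumes Python 2.6+ or 3, where the b'' prefix exists; on 2.4 a plain '...' literal should hold the same bytes. The names raw_latin1 and raw_utf8 are made up for illustration, and Latin-1 is only a guess at what a lone \xf6 byte would be.

```python
# Hypothetical stand-ins for the value returned by the database.
raw_latin1 = b'w\xf6rner'      # single byte 0xF6: o-umlaut in Latin-1, not valid UTF-8
raw_utf8 = b'w\xc3\xb6rner'    # two bytes 0xC3 0xB6: o-umlaut encoded as UTF-8

# Decoding the single-byte form as UTF-8 raises UnicodeDecodeError.
try:
    raw_latin1.decode('utf-8')
    latin1_decodes_as_utf8 = True
except UnicodeDecodeError:
    latin1_decodes_as_utf8 = False

# Each byte string decodes cleanly with the codec that matches it.
name_from_latin1 = raw_latin1.decode('latin-1')
name_from_utf8 = raw_utf8.decode('utf-8')
```

Printing repr(value) on the real database value would show which of the two byte sequences it actually contains.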
My goal is simply to make type(value) return unicode. I found an earlier question with some useful information, but the example from the accepted answer doesn't seem to run for me. Is there something I am doing wrong here?
Here is some code to reproduce:
Name = 'w\xc3\xb6rner'.decode('utf-8')
file.write('Name: %s - %s\n' %(Name, type(Name)))
I never actually get to the write statement, because the first statement fails.
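One possible way to narrow down which statement fails is to run the decode step and the write step separately, and to encode explicitly before writing. This is only a sketch under assumptions: it uses the b'' and u'' prefixes (Python 2.6+ or 3.3+; on 2.4 drop the b), and log_path is a hypothetical temporary file standing in for the real log file.

```python
import os
import tempfile

raw = b'w\xc3\xb6rner'

# Step 1: decode only. If this line raises, the bytes are not valid UTF-8.
name = raw.decode('utf-8')

# Step 2: build the log line as unicode, then encode it explicitly so the
# file object never has to fall back on an implicit default encoding.
line = u'Name: %s - %s\n' % (name, type(name).__name__)
encoded = line.encode('utf-8')

# Step 3: write the encoded bytes to a throwaway file in binary mode.
fd, log_path = tempfile.mkstemp(suffix='.log')
log = os.fdopen(fd, 'wb')
try:
    log.write(encoded)
finally:
    log.close()

# Read the bytes back to confirm what was written, then clean up.
check = open(log_path, 'rb')
try:
    written = check.read()
finally:
    check.close()
os.remove(log_path)
```

If step 1 succeeds on its own, the error is more likely coming from the formatting or write step than from decode().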
Thank you for your help.
Edit:
I verified that the DB's charset is utf8, so in my reproduction code I changed '\xf6' to '\xc3\xb6', but the failure still occurs. Is there a difference between 'utf-8' and 'utf8'?
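A quick way to check whether the two spellings behave differently is to decode the same bytes with each and compare the results. This sketch assumes Python 2.6+ or 3 for the b'' prefix (on 2.4, drop the b); the variable names are illustrative only.

```python
raw = b'w\xc3\xb6rner'

# Decode the same bytes using both spellings of the codec name.
decoded_hyphen = raw.decode('utf-8')
decoded_plain = raw.decode('utf8')

# If the names are aliases of one codec, the results are identical.
same_result = (decoded_hyphen == decoded_plain)
```

Python normalizes codec names on lookup, which would also explain why the error message says 'utf8' even though the call used 'utf-8'.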
The tip on using codecs to write to a file is handy (I'll definitely use it), but in this scenario I am only writing to a log file for debugging purposes.