As far as I can tell, there is a significant problem with some of the answers posted so far: unicode() decodes from the default encoding, which is often ASCII. Thus, the following code, which is essentially what previous answers recommend, fails on my machine:
# -*- coding: utf-8 -*-
author = 'éric'
print '{0}'.format(unicode(author))
gives:
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print '{0}'.format(unicode(author))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
The failure occurs because author contains bytes outside the ASCII range (values 0 to 127), and unicode() decodes from ASCII by default (on many machines).
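The same failure can be reproduced without relying on the source file's encoding by spelling out the UTF-8 bytes directly. This is only a minimal sketch: the names raw, failed, message and decoded are illustrative, and .decode('ascii') stands in for what unicode() does when the default encoding happens to be ASCII.

```python
# UTF-8 encoding of u'\xe9ric': the accented first letter takes two bytes (0xc3 0xa9).
raw = b'\xc3\xa9ric'

try:
    # Equivalent to unicode(raw) on a machine whose default encoding is ASCII.
    raw.decode('ascii')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)

# Naming the actual encoding of the bytes succeeds.
decoded = raw.decode('utf-8')
```

The bytes themselves are fine; only the choice of codec decides whether decoding works.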
A robust solution is to explicitly give the encoding used in your fields; taking UTF-8 as an example:
u'{0} in {1}'.format(unicode(self.author, 'utf-8'), unicode(self.publication, 'utf-8'))
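(The initial u makes the result a Unicode string. Dropping it to get a byte string only works when every decoded value is pure ASCII, because a byte-string template converts Unicode arguments back using the default encoding.)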
At this point, one might want to consider having the author and publication fields be Unicode strings, instead of decoding them during formatting.
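One way to do that is to decode once, when the object is built, so that formatting never has to deal with encodings. The sketch below is an assumption about how such a class might look, not the code from the question: Book, _to_text, describe and the sample values are hypothetical, and UTF-8 is assumed as the input encoding.

```python
class Book(object):
    """Hypothetical record type; the name and fields are illustrative."""

    def __init__(self, author, publication, encoding='utf-8'):
        # Decode once, at the boundary, so the fields are always Unicode.
        self.author = self._to_text(author, encoding)
        self.publication = self._to_text(publication, encoding)

    @staticmethod
    def _to_text(value, encoding):
        # Byte strings are decoded with the stated encoding;
        # values that are already Unicode pass through unchanged.
        if isinstance(value, bytes):
            return value.decode(encoding)
        return value

    def describe(self):
        # No decoding here: both fields are already Unicode strings.
        return u'{0} in {1}'.format(self.author, self.publication)


# Sample values only: UTF-8 bytes for the author, plain ASCII for the publication.
book = Book(b'\xc3\xa9ric', b'Le Monde')
label = book.describe()
```

With this layout the encoding is stated in exactly one place, and the formatting code stays the same whether the input arrived as bytes or as Unicode.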