While the answers so far are all correct, I thought I'd provide a more complete treatment:
The simplest way to represent a non-ASCII character in a script literal is to use the u prefix and \u or \U escapes, like so:
print u"Look \u0411\u043e\u0440\u0438\u0441, a G-clef: \U0001d11e"
This illustrates:
- using the u prefix to make sure the string is a unicode object
- using the \u escape (four hex digits) for characters in the Basic Multilingual Plane (U+FFFF and below)
- using the \U escape (eight hex digits) for characters in other planes (U+10000 and above)
- that Ƃ (U+0182 LATIN CAPITAL LETTER B WITH TOPBAR) and Б (U+0411 CYRILLIC CAPITAL LETTER BE) are just one example of many confusingly similar Unicode code points
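As a quick check of the points above, here is a minimal sketch using the standard unicodedata module. The variable names are my own, and I've written it so that it should run under Python 2 as well as Python 3.3 or later (where the u prefix is accepted again):

```python
import unicodedata

# BMP characters fit in four hex digits, so the \u escape is enough.
boris = u"\u0411\u043e\u0440\u0438\u0441"

# The G-clef lives outside the BMP, so it needs \U with eight hex digits.
clef = u"\U0001d11e"
clef_utf8 = clef.encode("utf-8")  # four bytes in UTF-8

# Two look-alike characters that are nevertheless different code points.
latin_b_topbar = u"\u0182"
cyrillic_be = u"\u0411"

latin_name = unicodedata.name(latin_b_topbar)
cyrillic_name = unicodedata.name(cyrillic_be)
look_alikes_equal = (latin_b_topbar == cyrillic_be)

# The first character of the name is the Cyrillic one, not the Latin one.
first_code_point = ord(boris[0])
```

Comparing the two names is usually the quickest way to find out which of two similar-looking characters you actually have.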
The default script encoding for Python that works everywhere is ASCII. As such, you have to use the above escapes to write literals containing non-ASCII characters. You can inform the Python interpreter of the encoding of your script with a line like:
# -*- coding: utf-8 -*-
This only tells the interpreter how to decode your script, so the file must actually be saved in that encoding. But then you could write:
print u"Look Борис, a G-clef: 𝄞"
Note that you still have to use the u prefix to obtain a unicode object, not a str object.
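To see that a literal written under the coding declaration and the escaped form give the same unicode object, here is a small sketch. It assumes the file really is saved as UTF-8, and the names are only for illustration:

```python
# -*- coding: utf-8 -*-

# The same five characters, once with escapes and once typed directly.
escaped = u"\u0411\u043e\u0440\u0438\u0441"
literal = u"Борис"

same_text = (escaped == literal)
char_count = len(literal)

# Each of these Cyrillic letters takes two bytes in UTF-8.
utf8_bytes = literal.encode("utf-8")
byte_count = len(utf8_bytes)
```

On Python 2, dropping the u prefix from the second literal would give you the 10-byte str rather than the 5-character unicode object, which is why the prefix still matters.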
Lastly, it is possible to change the default encoding used for str ... but this is not recommended, as it is a global change and may not play well with other Python code.