but what I want to see printed out is, surprise:
[('亀',), ('犬',)]
What do you want to see it printed out on? Because if it's the console, it's not at all guaranteed your console can display those characters. This is why Python's ‘repr()’ representation of objects goes for the safe option of \-escapes, which you will always be able to see on-screen and type in easily.
As a prerequisite you should be using Unicode strings (u''). And, as mentioned by Matthew, if you want to be able to write u'亀' directly in source you need to make sure Python can read the file's encoding. For occasional use of non-ASCII characters it is best to stick with the escaped version u'\u4e80', but when you have a lot of East Asian text you want to be able to read, “# coding=utf-8” is definitely the way to go.
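As a rough illustration of the two spellings, here is a minimal sketch written for Python 3 (where the u prefix is optional). The variable names are made up for the example. The escaped and the literal form denote the same one-character string:

```python
# coding=utf-8
# Sketch for Python 3; the names `escaped` and `literal` are illustrative only.

escaped = '\u4e80'   # ASCII-only source text, readable under any file encoding
literal = '亀'       # requires the file to be saved in the declared encoding

# Both spellings produce the same single-character string.
print(escaped == literal)
print(hex(ord(literal)))
```

The escaped form survives any editor or encoding mix-up; the literal form is easier to read once the coding declaration and the file's real encoding agree.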
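You could build the output string by hand:
# Python 2: join each tuple's items, wrap them in "(…,)", then join the tuples
print u'[%s]' % u', '.join(u'(%s,)' % u', '.join(ti) for ti in t)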
That would print the characters without the surrounding quotes. To keep the quotes, as repr() does, you'd really want:
def reprunicode(u):
    return repr(u).decode('raw_unicode_escape')
print u'[%s]' % u', '.join([u'(%s,)' % reprunicode(ti[0]) for ti in t])
This would work, but if the console doesn't support Unicode (and this is especially troublesome on Windows), you'll get a big old UnicodeError.
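One way to sidestep that error is to fall back to \-escapes for whatever the console's encoding cannot represent. The following is only a sketch for Python 3; console_safe is a hypothetical helper name, and falling back to ASCII when stdout reports no encoding is an assumption of this example:

```python
import sys

def console_safe(text, encoding=None):
    """Replace characters the target encoding cannot represent with backslash escapes."""
    # Assumption: fall back to ASCII when stdout does not report an encoding.
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    return text.encode(encoding, errors='backslashreplace').decode(encoding)

t = [('亀',), ('犬',)]
line = '[%s]' % ', '.join('(%s,)' % repr(ti[0]) for ti in t)

# Prints the characters verbatim on a UTF-8 console, escaped on an ASCII one.
print(console_safe(line))
```

The trade-off is the same one repr() makes: output that always displays, at the cost of readability when the console is limited.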
In any case, this rarely matters because the repr() of an object, which is what you're seeing here, doesn't usually make it to the public user interface of an application; it's really for the coder only.
However, you'll be pleased to know that Python 3.0 behaves exactly as you want:
- plain '' strings without the ‘u’ prefix are now Unicode strings
- repr() shows most Unicode characters verbatim
- Unicode in the Windows console is better supported (you can still get UnicodeError on Unix if your environment isn't UTF-8)
Python 3.0 is a little bit new and not so well-supported by libraries, but it might well suit your needs better.
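A small sketch of that Python 3 behaviour, using the data from the question (the variable names are just for illustration). The built-in ascii() is there if you still want an escaped form similar to what Python 2's repr() showed:

```python
t = [('亀',), ('犬',)]

verbatim = repr(t)   # Python 3 keeps printable non-ASCII characters as-is
escaped = ascii(t)   # ascii() escapes everything outside ASCII

# The escaped form is safe to print on any console.
print(escaped)
```

Printing verbatim directly can still raise a UnicodeError if the environment's encoding cannot represent the characters.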