Is there a Python convention for when you should implement __str__() versus __unicode__()? I've seen classes override __unicode__() more frequently than __str__(), but it doesn't appear to be consistent. Are there specific rules for when it is better to implement one versus the other? Is it necessary or good practice to implement both?

A: 

With the world getting smaller, chances are that any string you encounter will contain Unicode eventually. So for any new apps, you should at least provide __unicode__(). Whether you also override __str__() is then just a matter of taste.

Aaron Digulla
+2  A: 

If I didn't especially care about micro-optimizing stringification for a given class I'd always implement __unicode__ only, as it's more general. When I do care about such minute performance issues (which is the exception, not the rule), having __str__ only (when I can prove there never will be non-ASCII characters in the stringified output) or both (when both are possible), might help.

These, I think, are solid principles, but in practice it's very common to KNOW there will be nothing but ASCII characters without making any effort to prove it (e.g. the stringified form only has digits, punctuation, and maybe a short ASCII name;-), in which case it's quite typical to move directly to the "just __str__" approach (but if a programming team I worked with proposed a local guideline to avoid that, I'd be +1 on the proposal, as it's easy to err in these matters AND "premature optimization is the root of all evil in programming";-).

Alex Martelli
+14  A: 

__str__() is the old method -- it returns bytes. __unicode__() is the new, preferred method -- it returns characters. The names are a bit confusing, but in 2.x we're stuck with them for compatibility reasons. Generally, you should put all your string formatting in __unicode__(), and create a stub __str__() method:

def __str__(self):
    return unicode(self).encode('utf-8')
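
To show where that stub sits in a whole class, here is a minimal sketch. The class name Greeting and its name field are made up for illustration, and the sys.version_info check is an added assumption so the same file also loads on Python 3, where the unicode builtin no longer exists.

```python
import sys


class Greeting(object):
    """Hypothetical example class; the name and field are illustrative."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        # All formatting lives here and produces text (characters).
        return u"Hello, %s!" % self.name

    if sys.version_info[0] == 2:
        def __str__(self):
            # Python 2: str() must return bytes, so encode the text.
            return unicode(self).encode('utf-8')
    else:
        def __str__(self):
            # Python 3: str is already text, so reuse the same formatting.
            return self.__unicode__()


g = Greeting(u"Zo\u00eb")   # a non-ASCII name
text = g.__unicode__()      # the character form, on either version
```

On 2.x, unicode(g) gives the characters and str(g) gives the UTF-8 encoded bytes; the formatting logic itself is written only once.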

In 3.0, str contains characters, so the corresponding methods are named __bytes__() (returns bytes) and __str__() (returns characters). These behave as expected.
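
For comparison, a rough Python 3 sketch of the same split. The class name Label is hypothetical, and UTF-8 is an assumed encoding choice here, not something the language requires.

```python
class Label:
    """Hypothetical example class for Python 3."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Text form: called by str() and print().
        return "Hello, %s!" % self.name

    def __bytes__(self):
        # Byte form: called by bytes(); the encoding is our choice.
        return str(self).encode("utf-8")


label = Label("Zo\u00eb")
as_text = str(label)     # characters
as_bytes = bytes(label)  # UTF-8 encoded bytes
```

Here __str__() plays the role that __unicode__() played in 2.x, and __bytes__() takes over the role of the old __str__().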

John Millikin
So do you mean creating both __unicode__ and __str__ methods, or just keeping strings in _(u"") and creating __string__ (without the unicode method)?
muntu
Great answer...thanks a lot. +1.
jkp