Hey there, I ran into the following. I don't need it clarified, but I find it interesting. Consider:

>>> class A:
...     def __str__(self):
...             return "some A()"
... 
>>> class B(A):
...     def __str__(self):
...             return "some B()"
... 
>>> print A()
some A()
>>> print B()
some B()
>>> A.__str__ == B.__str__
False # seems reasonable, since each method is an object
>>> id(A.__str__)==id(B.__str__)
True # what?!

What's going on here? Cheers.

+1  A: 

Comparing the underlying functions (im_func) instead gives the expected results:

>>> id(A.__str__.im_func) == id(A.__str__.im_func)
True
>>> id(B.__str__.im_func) == id(A.__str__.im_func)
False
honzas
+5  A: 

As the expression id(A.__str__) == id(B.__str__) is evaluated, A.__str__ is created, its id is taken, and then it is garbage collected. Then B.__str__ is created, and happens to end up at the exact same address that A.__str__ was at earlier, so it gets (in CPython) the same id.

Try assigning A.__str__ and B.__str__ to temporary variables and you'll see something different:

>>> f = A.__str__
>>> g = B.__str__
>>> id(f) == id(g)
False

For a simpler example of this phenomenon, try:

>>> id(float('3.0')) == id(float('4.0'))
True
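
A rough Python 3 sketch of the same point, in case it helps. Python 3 has no unbound methods, so this uses bound methods (a.__str__), which are also built fresh on each attribute access. The helper ids_while_alive is a made-up name for illustration. Only the "both alive" result is guaranteed; the last comparison depends on CPython's allocator and may go either way.

```python
class A:
    def __str__(self):
        return "some A()"


class B(A):
    def __str__(self):
        return "some B()"


def ids_while_alive(make_first, make_second):
    """Return the ids of two objects taken while both are still referenced."""
    first = make_first()
    second = make_second()
    return id(first), id(second)


a, b = A(), B()

# Both method objects exist at the same time here, so their ids must differ.
id_f, id_g = ids_while_alive(lambda: a.__str__, lambda: b.__str__)
distinct_when_alive = id_f != id_g

# No reference is kept here: the first temporary can be freed before the
# second one is created, so CPython may hand out the same address again.
# This is an implementation detail, not something to rely on.
maybe_reused = id(a.__str__) == id(b.__str__)

print(distinct_when_alive, maybe_reused)
```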
Mark Dickinson
But then, why:
>>> f = A.__str__
>>> id(f) == id(A.__str__)
False
Krab
`A.__str__` is created??? Not sure about this: it must be part of the metaclass that spawns all classes i.e. their basic "DNA".
jldupont
@jldupont: Python creates the unbound methods `A.__str__` and `B.__str__` at runtime. http://users.rcn.com/python/download/Descriptor.htm is a good reference for the underlying mechanisms.
Mark Dickinson
@Krab: There you've got two copies of `A.__str__` existing at the same time (apparently Python doesn't cache previously-created unbound methods in any way). Any two distinct objects that exist simultaneously must have different ids.
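
A small Python 3 sketch of the "no caching" point, assuming a toy class with a hypothetical greet method. In Python 3 the attribute to look at is __func__ on a bound method (the Python 2 transcript above uses im_func).

```python
class A:
    def greet(self):
        return "hello"


a = A()

# Each attribute access builds a new bound method object.
m1 = a.greet
m2 = a.greet

# Two distinct objects alive at the same time, hence different ids.
same_object = m1 is m2

# Both wrap the one function stored on the class.
same_function = m1.__func__ is m2.__func__ and m1.__func__ is A.__dict__["greet"]

# Equality compares the wrapped instance and function, not identity.
compare_equal = m1 == m2

print(same_object, same_function, compare_equal)
```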
Mark Dickinson
hmmm... learning new things about Python every day! Thanks! Not sure what the implications of this behavior are when it comes to using `id` for hashing objects, though...
jldupont
hmmm... can those be garbage collected that quickly? I am still unconvinced by this explanation.
jldupont
@jldupont, CPython is refcounted, so most things are garbage collected immediately. This is the correct explanation of what happens.
Mike Graham
@jldupont: It's easy to test the theory! Try creating a subclass `myfloat` of float, and overriding `__new__` and `__del__` so that they log their calls appropriately. Now watch the sequence of operations when you evaluate `id(myfloat(1.0)) == id(myfloat(2.0))`. (Not that a call to `__del__` necessarily corresponds directly to garbage collection, though.)
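
One possible way to set up that experiment in Python 3, as a sketch only. The events list is an assumption added for logging. The ordering check relies on CPython's reference counting; other implementations may run `__del__` later or not at all, and whether the two ids actually coincide is up to the allocator.

```python
events = []  # log of (kind, value) tuples, in the order they happen


class myfloat(float):
    def __new__(cls, value):
        obj = super().__new__(cls, value)
        events.append(("new", float(obj)))
        return obj

    def __del__(self):
        events.append(("del", float(self)))


# Neither temporary is bound to a name, so each is dropped as soon as
# id() has returned.
same_id = id(myfloat(1.0)) == id(myfloat(2.0))

# On a refcounting implementation, the first object is finalized before
# the second one is even created.
first_freed_before_second_created = (
    events.index(("del", 1.0)) < events.index(("new", 2.0))
)

for event in events:
    print(event)
print("same id:", same_id)
```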
Mark Dickinson
@jldupont, incidentally, you wouldn't usually use `id` to define a hash yourself. It's used by default for instances of classes you define, which is reliable and has none of the descriptor subtlety. For a method (which is created on the fly), the hash depends on the underlying function (which is constant), so it is rarely hard to choose a proper basis for a hash.
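
To illustrate with a Python 3 sketch (the Button class, its on_click method, and the registry dict are invented for the example): two separately created method objects still hash and compare the same, so they behave as the same dictionary key.

```python
class Button:
    def on_click(self):
        return "clicked"


button = Button()

# Two separately created bound method objects for the same instance/function.
first = button.on_click
second = button.on_click

distinct_objects = first is not second
hashes_match = hash(first) == hash(second)

# Because hash and equality agree, a lookup with a freshly created
# method object finds the entry stored under an earlier one.
registry = {first: "handler registered"}
found = registry[second]

# The default hash of a plain instance stays stable for its whole lifetime.
instance_hash_stable = hash(button) == hash(button)

print(distinct_objects, hashes_match, found, instance_hash_stable)
```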
Mike Graham
@Mike Graham: tested the garbage collection behavior as hinted: I am convinced now. Thanks.
jldupont
+1: for the entertaining discussion and pointing me to new learning material. Cheers.
jldupont