Why does CPython (no clue about other Python implementations) have the following behavior?

tuple1 = ()
tuple2 = ()
dict1 = {}
dict2 = {}
list1 = []
list2 = []
# makes sense, tuples are immutable
assert(id(tuple1) == id(tuple2))
# also makes sense dicts are mutable
assert(id(dict1) != id(dict2))
# lists are mutable too
assert(id(list1) != id(list2))
assert(id(()) == id(()))
# why no assertion error on this?
assert(id({}) == id({}))
# or this?
assert(id([]) == id([]))

I have a few ideas about why this might happen, but I can't find a concrete reason.

EDIT

To further prove Glenn's and Thomas' point:

[1] id([])
4330909912
[2] x = []
[3] id(x)
4330909912
[4] id([])
4334243440
A: 

It doesn't work the same way in Jython:

>>> id({})
1
>>> id([])
2

Could there be an optimization going on where commonly used (i.e. empty) containers are "interned" to save on allocation costs?

This (in CPython) suggests not:

>>> def mutateid(obj):
...   obj.append('x')
...   print obj
...   print id(obj)
... 
>>> mutateid([])
['x']
4299590472
>>> id([])
4299590472
>>> 
phlip
That's for two reasons, really: first of all Jython uses Java's GC, which means objects aren't collected as soon as the last reference goes away (like in CPython). Second of all, because Java objects aren't in fixed memory locations (like in CPython), Jython can't use the object's memory address for its id. It has to use something else while keeping the semantics of `id()`. As I recall, Jython uses a counter that only starts counting when you call `id()` on an object.
Thomas Wouters
That code doesn't really test the same phenomenon that the OP is asking about. What does `id({}) == id({})` return in Jython?
Marcelo Cantos
In Jython, `id({}) == id({})` returns `False`.
phlip
A: 

The == operator on lists and dicts does not compare the object IDs to see if they are the same object; use obj1 is obj2 for that.

Instead, the == operator compares the members of the list or dict to see if they are the same.
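
The difference between the two operators can be seen in a short sketch (Python 3 syntax; the variable names are only illustrative and do not come from the question):

```python
# Two separately created empty lists: equal in value, distinct in identity.
first = []
second = []

same_value = first == second    # compares the contents of the lists
same_object = first is second   # compares object identity

print(same_value)    # True
print(same_object)   # False
```

Both lists stay alive for the whole comparison here, which is why the identity check is reliable.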

Dave Kirby
He's not comparing `[] == []`, he's comparing `id([]) == id([])`.
Glenn Maynard
Notice how he's not comparing the lists and dicts, but their `id()`.
Thomas Wouters
The OP isn't trying to do that.
Marcelo Cantos
Actually, most implementations of == check with the `is` operator first before doing a member-wise comparison, the reason being that two objects with the same id must have the same content. But you're right that an `id()` comparison should be done using the `is` operator instead of `id(a) == id(b)`, and that taking the id() of an object is generally meaningless.
Lie Ryan
@Lie Ryan interestingly, that doesn't work in this case: `[] is []` is `False`. I guess passing the first `[]` to `id` creates a scope for it to go out of.
aaronasterling
@AaronMcSmooth: `[] is []` returning False indicates that the two lists are indeed two different objects, which is the expected behavior. The `is` operator is designed to hold a reference to both compared objects, so they don't go out of scope before the comparison (which is impossible to guarantee with the semantics of the == operator and id()).
Lie Ryan
+6  A: 

CPython uses reference counting, so it frees an object as soon as the last reference to it goes away. The first [] has therefore already been freed by the time the second [] is created, and most of the time the second one ends up in the same memory location.

This shows what's happening very clearly (the output is likely to be different in other implementations of Python):

class A(object):
    def __init__(self): print "a",
    def __del__(self): print "b",

# a a b b False
print A() is A()
# a b a b True
print id(A()) == id(A())
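
For readers on Python 3, the same experiment can be sketched with an event log instead of print statements. The names `Tracked`, `events`, `compare_with_is` and `compare_with_id` are made up for this illustration, and the ordering shown in the comments relies on CPython's reference counting:

```python
events = []  # records the order in which objects are created and destroyed


class Tracked(object):
    def __init__(self):
        events.append("create")

    def __del__(self):
        events.append("destroy")


def compare_with_is():
    del events[:]
    # Both operands must exist at the same time for `is` to compare them.
    result = Tracked() is Tracked()
    return result, list(events)


def compare_with_id():
    del events[:]
    # Each temporary is released as soon as id() has returned.
    result = id(Tracked()) == id(Tracked())
    return result, list(events)


print(compare_with_is())  # on CPython: (False, ['create', 'create', 'destroy', 'destroy'])
print(compare_with_id())  # on CPython the events are create, destroy, create, destroy
```

Whether the second call reports equal ids depends on the allocator handing back the same block, so that part is likely but not guaranteed.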
Glenn Maynard
Although Thomas' answer is equally correct, you provide concrete reasoning which is what I was looking for.
spenthil
+16  A: 

When you call id({}), Python creates a dict and passes it to the id function. The id function takes its id (its memory location), and throws away the dict. The dict is destroyed. When you do it twice in quick succession (without any other dicts being created in the meantime), the dict Python creates the second time happens to use the same block of memory as the first time. (CPython's memory allocator makes that a lot more likely than it sounds.) Since (in CPython) id uses the memory location as the object id, the ids of the two objects are the same. This obviously doesn't happen if you assign the dicts to variables and then get their id(), because the dicts are alive at the same time, so their ids have to be different.
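
A minimal sketch of the two cases described above, in Python 3 syntax. The function names are hypothetical; only the first result is guaranteed, while the second depends on CPython's allocator:

```python
def ids_of_live_dicts():
    # Both dicts are alive when id() is called, so the ids must differ.
    first = {}
    second = {}
    return id(first), id(second)


def ids_of_temporary_dicts():
    # Each dict is discarded right after id() returns, so the second one
    # may be placed in the block of memory the first one just gave up.
    return id({}), id({})


live_a, live_b = ids_of_live_dicts()
print(live_a != live_b)   # always True

temp_a, temp_b = ids_of_temporary_dicts()
print(temp_a == temp_b)   # often True on CPython, but not guaranteed
```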

Mutability does not directly come into play, but the caching of tuples and strings by code objects does. Within the same code object (a function, class body, or module body), identical literals (integers, strings and certain tuples) will be re-used. Mutable objects can never be re-used; they're always created at runtime.
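
The literal re-use can be observed with a small sketch like the following (Python 3 syntax; `repeated_literals` is a hypothetical name). The string result is a CPython implementation detail and should not be relied on, whereas the list result holds everywhere:

```python
def repeated_literals():
    # Identical string literals in one code object share a single constant.
    text_a = "spam"
    text_b = "spam"
    # Each list display builds a brand-new object at runtime.
    list_a = []
    list_b = []
    return text_a is text_b, list_a is list_b


strings_shared, lists_shared = repeated_literals()
print(strings_shared)  # True on CPython (implementation detail)
print(lists_shared)    # False
```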

In short, an object's id is only unique for the lifetime of the object. After the object is destroyed, or before it is created, something else can have the same id.

Thomas Wouters
Is there a shorter way to say "vote-up-the-top-answer-and-ignore-the-rest-regardless-of-merit-itis"?
Glenn Maynard
Sort of, Glenn: you mark it as the answer :)
spenthil
@spenthil: Mark-the-top-answer-correct-ignoring-the-clearer-answer-below-it-itis? That does, indeed, also happen. :P
Glenn Maynard
Sorry, I was experimenting before marking (see the OP).
spenthil
Although you are correct, I was looking for verifiable reasoning (see @Glenn Maynard)
spenthil
Without making any comment on the relative merits of the answers, I feel that upvotes should be scaled, e.g. worth `some_constant * log(upvoter_rep) / log(answerer_rep)`
John Machin
@Glenn: I'm not sure why my answer has less merit than yours. Sure, it's longer, and it doesn't contain the experimentation you did in yours, but that's because I actually know these things from the CPython source. There's no need for experimentation.
Thomas Wouters
My answer contains no experimentation, it contains a specific example meant to *demonstrate* precisely what's happening. I consider my answer clearer because it explains the entire issue in two concise sentences, rather than several paragraphs. However, to be clear: I don't think this is a bad answer at all. It was just the silliness of this answer being +12 while mine was +1 that I found amusing.
Glenn Maynard