views:

268

answers:

4

I have some python code that's throwing a KeyError exception. So far I haven't been able to reproduce outside of the operating environment, so I can't post a reduced test case here.

The code that's raising the exception is iterating through a loop like this:

for k in d.keys():
    if condition:
        del d[k]

The del[k] line throws the exception. I've added a try/except clause around it and have been able to determine that k in d is False, but k in d.keys() is True.

The keys of d are bound methods of old-style class instances.

The class implements __cmp__ and __hash__, so that's where I've been focusing my attention.

+4  A: 

Don't delete items in d while iterating over it, store the keys you want to delete in a list and delete them in another loop:

deleted = []
for k in d.keys():
    if condition:
        deleted.append(k)
for k in deleted:
    del d[k]
Luper Rouch
he's not iterating over `d`, he's iterating over `d.keys()` list
SilentGhost
@SilentGhost if the keys are lazy loaded (which I suspect they are but cannot confirm at the moment) then it's effectively the same thing in this context.
Davy8
@what do you mean lazy loaded? it's a list.
SilentGhost
SilentGhost is completely right, `d.keys()` already gives a new list of items, so there is no point to copy it again and it won't make a difference.
THC4k
@SilentGhost you're right, unless he is on python 3.x where dict.keys() returns an iterator. Also maybe he could be modifying `d` when testing `condition` ?
Luper Rouch
I see a tag of Python-2.x
surely, old-style classes mentioned in a question and the python-2.x tag indicate that he is not on py3k. In py3k the sample code would raise a `RuntimeError`
SilentGhost
I am on python 2.6.2
Chris AtLee
This also doesn't fix the problem
Chris AtLee
@Chris I suspect it's the hash function -- could you add that to your question above?
+15  A: 

k in d.keys() will test equality iteratively for each key, while k in d uses __hash__, so your __hash__ may be broken (i.e. it returns different hashes for objects that compare equal).

adw
`__hash__` is buildbot's buildbot.util.ComparableMixin.__hash__, which looks at an instance's compare_attrs to generate a hash value. These values were changing over time, so the object's hash wasn't stable for its lifetime.
Chris AtLee
Yup, that's another way to break `__hash__`. But then it doesn't even make sense to use the object as a dictionary key.
adw
Yeah, it's not my code that's putting the object as the dictionary key (un)fortunately.
Chris AtLee
A: 

What you're doing would throw a a concurrent modification exception in Java. d.keys() creates a list of the keys as they exist when you call it, but that list is now static - modifications to d will not change a stored version of d.keys(). So when you iterate over d.keys() but delete items, you end up with the possibility of modifying a key that is no longer there.

You can use d.pop(k, None), which will return either the value mapped to k or None, if k is not present. This avoids the KeyError problem.

EDIT: For clarification, to prevent more phantom downmods (no problem with negative feedback, just make it constructive and leave a comment so we can have a potentially informative discussion - I'm here to learn as well as help):

It's true that in this particular condition it shouldn't get messed up. I was just bringing it up as a potential issue, because if he's using the same kind of coding scheme in another portion of the program where he isn't so careful/lucky about how he's treating the data structure, such problems could arise. He isn't even using a dictionary, as well, but rather a class that implements certain methods so you can treat it in a similar fashion.

nearlymonolith
No, that's not true -- if you only ever `del d[k]` for `k` in `d.keys()`, you'll never delete a key twice, since keys are unique.
katrielalex
Assuming everything is as he has it above, keys are unique in a dictionary, and he only deletes keys once per iteration so he's not deleting keys that he's already deleted.
Well yes, that's true that in this particular condition it shouldn't get messed up. I was just bringing it up as a potential issue. He isn't even using a dictionary, as well, but rather a class that implements certain methods.
nearlymonolith
+3  A: 

Simple example of what's broken, for interest:

>>> count = 0
>>> class BrokenHash(object):
...     def __hash__(self):
...             global count
...             count += 1
...             return count
...
...     def __eq__(self, other):
...             return True
...
>>> foo = BrokenHash()
>>> bar = BrokenHash()
>>> foo is bar
False
>>> foo == bar
True
>>> baz = {bar:1}
>>> foo in baz
False
>>> foo in baz.keys()
True
katrielalex
This may be the most pathological Python I have ever written...
katrielalex