Is it safe to modify a mutable object returned by a method of a standard library object?

Here's one specific example; but I'm looking for a general answer if possible.

# m is a MatchObject
# I know there's only one named group in the regex
# I want to retrieve the name and the value
g, v = m.groupdict().popitem()
# do something else with m

Is this code safe? I'm concerned that by modifying the dictionary returned by groupdict() I'm corrupting the object m (which I still need later).

I tested this out, and a subsequent call to m.groupdict() still returned the original dictionary; but for all I know this may be implementation-dependent.

A: 

groupdict returns a new dictionary every time:

In [20]: id(m.groupdict())
Out[20]: 3075475492L

In [21]: id(m.groupdict())
Out[21]: 3075473588L

I can't speak for the whole standard library, though. You have to check for yourself whether a method returns a reference to a structure stored inside the object or creates a new one every time it is called. groupdict creates a new one every time, which is why it should be safe to modify the resulting dictionary.
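One way to check this without reading the implementation is to keep two results alive at the same time and compare them with `is`. (Equal ids from two temporaries would not prove they are the same object, since an id can be reused once an object is freed.) Below is a minimal sketch, assuming Python 3; the pattern and the group name `word` are made up for illustration. It only shows what the interpreter running it does, not what the documentation guarantees.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
m = re.match(r"(?P<word>\w+)", "hello world")

# Hold both results at once so the identity comparison is meaningful.
first = m.groupdict()
second = m.groupdict()
distinct = first is not second

# Mutate one result, then ask the match object again.
name, value = first.popitem()
after = m.groupdict()

print(distinct, name, value, after)
```

If `distinct` is true and `after` still contains the group, the mutation did not reach the match object on this interpreter.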

gruszczy
Thank you, this helps. However, I'm not sure how I would check what the method returns, other than by reading the implementation code. Of course, I don't want to rely on any specific implementation in deciding whether my code is safe.
max
A: 

You should be safe in this particular case, because MatchObject.groupdict() returns a dictionary representing the matched groups, and that dictionary is a new copy every time.

dict.popitem() does change the dictionary you are calling it on though.

Remove and return an arbitrary (key, value) pair from the dictionary.

popitem() is useful to destructively iterate over a dictionary, as often used in set algorithms. If the dictionary is empty, calling popitem() raises a KeyError.

quantumSoup
How can I be certain that it is a new copy? I expect it's stated somewhere in the Python documentation, but where?
max
@max As far as I know, all of [`MatchObject`](http://docs.python.org/library/re.html#match-objects)'s methods and properties return values only, and thus you can't change the underlying object. I don't think this *has* to be documented, since it wouldn't make any sense for you to be able to change the object.
quantumSoup
@quantumSoup Correct me if I'm wrong, but I thought in Python you can only return a reference - and a reference can be to an internal object or to a newly created object.
max
@max I meant that you can't change the underlying MatchObject
quantumSoup
@quantumSoup hmm.. if MatchObject happens to expose any of its internal parts to you through a reference, you can modify them. And if MatchObject doesn't expect you to, you may corrupt it by this modification.
max
@max That's why it doesn't expose its internal parts.
quantumSoup
A: 

There are two different operations being performed on m here. The first, groupdict, creates a dictionary from m. The second, popitem, returns an item from that dictionary and modifies the dictionary (but not the underlying match object).

So subsequent calls to m.groupdict() still create the dictionary from the same m.

But why are you using popitem at all? Why not just items()?
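A non-destructive way to read the single pair could look like the sketch below. It assumes Python 3, where `items()` returns a view rather than a list, and reuses a made-up pattern with one named group, `word`.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
m = re.match(r"(?P<word>\w+)", "hello world")
groups = m.groupdict()

# Take the first (and only) pair without removing it from the dictionary.
name, value = next(iter(groups.items()))

# Alternative: unpacking raises ValueError unless there is exactly one item,
# which doubles as a check on the "only one named group" assumption.
((name2, value2),) = groups.items()

print(name, value, groups)
```

Either form leaves `groups` unchanged, so the question of whether it is shared with `m` does not arise.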

Andrew Jaffe
But how do I know that groupdict() actually creates a brand new dictionary every time I call it? I am afraid that in some implementation it might provide a reference to an internal dictionary object. Is this clarified anywhere in the documentation? Regarding your last point, of course I could easily switch to the non-destructive .items()[0]; I just wanted to give a simple example.
max
A: 

Generally it's only guaranteed to be safe if the documentation says so. (In this particular case it seems very unlikely that another implementation would behave differently though.)
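When the documentation is silent, one defensive option is to copy the returned object before modifying it. The sketch below uses a hypothetical `Registry` class (not from any library) whose accessor deliberately hands out its internal dictionary, to show that mutating a copy leaves the owner intact. Note that `dict(...)` makes a shallow copy only; `copy.deepcopy` is needed if nested values will be mutated too.

```python
import copy


class Registry:
    """Hypothetical class whose accessor exposes its internal dict."""

    def __init__(self):
        self._entries = {"a": [1, 2]}

    def entries(self):
        return self._entries  # returns internal state, not a copy


reg = Registry()

# Shallow copy: a new outer dict, so popitem() cannot touch reg._entries.
shallow = dict(reg.entries())
shallow.popitem()
intact_after_shallow = reg.entries() == {"a": [1, 2]}

# Deep copy: nested values are duplicated as well.
deep = copy.deepcopy(reg.entries())
deep["a"].append(3)
intact_after_deep = reg.entries() == {"a": [1, 2]}

print(intact_after_shallow, intact_after_deep)
```

The copy costs some time and memory, but it removes the dependence on undocumented behavior.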

JanC
True. Other implementations *usually* follow what CPython does and CPython *usually* preserves compatibility with its own previous versions.
Constantin