I wanted to know whether it is safe ( documented behaviour? ) to delete from the domain space of an iterator while it is executing in Python.

Consider the code:

import os
import sys

sampleSpace = [ x*x for x in range( 7 ) ]

print sampleSpace

for dx in sampleSpace:

    print str( dx )

    if dx == 1:

        del sampleSpace[ 1 ]
        del sampleSpace[ 3 ]

    elif dx == 25:

        del sampleSpace[ -1 ]

print sampleSpace

'sampleSpace' is what I call 'the domain space of an iterator' ( if there is a more appropriate word/phrase, lemme know ).

What I am doing is deleting values from it while the iterator 'dx' is running through it.

Here is what I expect from the code :

Iteration versus element being pointed to (*):

0: [*0, 1, 4, 9, 16, 25, 36]
1: [0, *1, 4, 9, 16, 25, 36] ( delete 2nd and 5th element after this iteration )
2: [0, 4, *9, 25, 36]
3: [0, 4, 9, *25, 36] ( delete -1th element after this iteration )
4: [0, 4, 9, 25*] ( as the iterator points to nothing/end of list, the loop terminates )

.. and here is what I get:

[0, 1, 4, 9, 16, 25, 36]
0
1
9
25
[0, 4, 9, 25]

As you can see - what I expect is what I get - which is contrary to the behaviour I have had from other languages in such a scenario.

Hence - I wanted to ask you if there is some rule like "the iterator becomes invalid if you mutate its space during iteration" in Python?

Is it safe ( documented behaviour? ) in Python to do stuff like this?

+1  A: 

What do you mean by safe? Your code happens not to raise any errors, but errors are a distinct possibility, of course. Consider this:

>>> a = range(3)
>>> for i in a:
    del a


Traceback (most recent call last):
  File "<pyshell#13>", line 2, in <module>
    del a
NameError: name 'a' is not defined
>>> a
[0, 1, 2]
>>> for i in a:
    del a[i+1]


Traceback (most recent call last):
  File "<pyshell#27>", line 2, in <module>
    del a[i+1]
IndexError: list assignment index out of range

It is not clear why you would want to do this, but there are no additional rules applicable to iterators. They act exactly as any other type would.

SilentGhost
I would need to operate on a list of files in a directory. Now it might happen that my operations, as a side effect, end up operating on other files besides the one explicitly passed. In that case, I do not want to operate again on the very same files just because they are in the same list. Should I raise this as a separate question if it reads too fuzzy?
PoorLuzer
@PoorL: why would you mutate the iterator in the first place?
SilentGhost
Here is my situation: say I find 4 text files in the current directory: a, b, c, d. I want to make backups of all text files, so I store these filenames in the 'master list'. In my function that handles the backup, I realize that b and c are actually copies of a, so I really do not need to back them up. Instead of adding code to explicitly skip over such files, I would like to delete the entries from the 'master list'. This brings up the scenario that I posted. How else can I handle this without adding explicit checks for skipping entries?
PoorLuzer
@PoorL: there are many ways to do this. According to the Zen of Python: "explicit is better than implicit". It might not be such a bad idea to just add simple checks. Another way, of course, is to create a class that represents a file and do the whole comparison through `set`s. I don't think any advice here would be helpful without seeing your actual task at hand first.
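For what it is worth, the "simple checks" route could be sketched roughly as below, in Python 3 syntax. Everything here is an assumption made for illustration: `back_up_all`, `find_duplicates_of` and `back_up` are hypothetical names, and the toy data only mirrors the a/b/c/d example from the previous comment.

```python
def back_up_all(filenames, find_duplicates_of, back_up):
    """Back up each file once, skipping files already known to be copies.

    `find_duplicates_of` and `back_up` are hypothetical callables supplied
    by the caller; they stand in for the real comparison and copy logic.
    """
    skip = set()
    backed_up = []
    for name in filenames:          # the master list is never mutated
        if name in skip:
            continue
        back_up(name)
        backed_up.append(name)
        skip.update(find_duplicates_of(name))
    return backed_up


# Toy data: b and c are copies of a.
copies = {"a": {"b", "c"}}
done = back_up_all(
    ["a", "b", "c", "d"],
    find_duplicates_of=lambda name: copies.get(name, set()),
    back_up=lambda name: None,      # placeholder: no real file I/O here
)
print(done)   # ['a', 'd']
```

The explicit check costs one `if`, and the list being iterated stays untouched, so the loop no longer depends on how the list iterator reacts to deletions.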
SilentGhost
.. good advice. A new question coming up in a few days.
PoorLuzer
+5  A: 

From the Python tutorial:

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy. The slice notation makes this particularly convenient:

>>> for x in a[:]: # make a slice copy of the entire list
...    if len(x) > 6: a.insert(0, x)
...
>>> a
['defenestrate', 'cat', 'window', 'defenestrate']
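
Applied to the kind of deletion in the question, the same idea might look like the sketch below. It uses Python 3 syntax, and the names `numbers`, `squares` and `is_unwanted` are made up for illustration; they do not come from the tutorial.

```python
def is_unwanted(value):
    # Hypothetical predicate for this example: drop odd values.
    return value % 2 == 1


numbers = [x * x for x in range(7)]   # [0, 1, 4, 9, 16, 25, 36]

# Option 1: iterate over a slice copy, delete from the original.
for value in numbers[:]:
    if is_unwanted(value):
        numbers.remove(value)

print(numbers)   # [0, 4, 16, 36]

# Option 2: build a new list and leave the original untouched.
squares = [x * x for x in range(7)]
kept = [value for value in squares if not is_unwanted(value)]
print(kept)      # [0, 4, 16, 36]
```

Option 2 is usually the simpler choice when nothing else holds a reference to the original list.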
Bastien Léonard
Aha! This is what I was looking for. After reading this piece of text, I would, in my sense of the word, say the operation is *not safe*.
PoorLuzer
Marking yours as the answer as it directly answered my question. SilentGhost's is more detailed and rich in examples though.
PoorLuzer
Because the behaviour is not defined by the Python language, the outcome is free to change between different implementations, or even different versions of the same implementation.
gnibbler
A: 

Generally speaking, no: it's not safe, and you may get unpredictable behaviour. Iterators aren't required to behave in a specific way under these circumstances.
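
As a rough illustration of how much the reaction varies by container, the sketch below (Python 3 syntax, run against CPython; the variable names are invented for this example) mutates a list and a dict while looping over each. The list silently skips an element, whereas the dict raises a `RuntimeError`.

```python
# A list does not complain; it just skips an element.
letters = ["a", "b", "c", "d"]
seen = []
for letter in letters:
    seen.append(letter)
    if letter == "b":
        letters.remove("b")
print(seen)      # ['a', 'b', 'd']

# A dict notices the size change and raises on the next step.
ages = {"ann": 31, "bob": 27, "cy": 45}
error_message = None
try:
    for name in ages:
        del ages[name]
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)
```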

What's happening in your example is:

# the list is [0, 1, 4, 9, 16, 25, 36]

if dx == 1:
    # we're at index 1 when this is true
    del sampleSpace[ 1 ]
    # we've removed the item at index 1; the list now contains [0, 4, 9, 16, 25, 36]
    # the iterator keeps its internal index, so the next step reads index 2 (value 9)
    # and the value 4, which has shifted down to index 1, is never visited
    del sampleSpace[ 3 ]
    # we remove the item at index 3, which is (now) the value 16
    # the list now contains [0, 4, 9, 25, 36]
elif dx == 25:
    # we're at index 3 when this is true
    del sampleSpace[ -1 ]
    # we remove the final item; the list now contains [0, 4, 9, 25]
    # the next index (4) equals the new length, so the loop ends without an error
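
To make the index bookkeeping concrete, here is a small sketch in Python 3 syntax that imitates the loop with an explicit index. It assumes the list iterator works the way CPython's does (an internal index compared against the current length on every step); `iterate_like_a_list_iterator` and `delete_like_the_question` are hypothetical helper names, not anything from the standard library.

```python
def iterate_like_a_list_iterator(items, on_visit):
    """Walk `items` with an explicit index, re-reading the length each step."""
    visited = []
    index = 0
    while index < len(items):      # length is re-checked on every pass
        value = items[index]
        visited.append(value)
        on_visit(items, value)     # the callback may mutate `items`
        index += 1                 # the index advances regardless
    return visited


def delete_like_the_question(items, value):
    # Same deletions as the loop body in the question.
    if value == 1:
        del items[1]
        del items[3]
    elif value == 25:
        del items[-1]


sample_space = [x * x for x in range(7)]
visited = iterate_like_a_list_iterator(sample_space, delete_like_the_question)
print(visited)        # [0, 1, 9, 25]
print(sample_space)   # [0, 4, 9, 25]
```

The value 4 never shows up in `visited`, which matches the output in the question: it slid into a slot the index had already passed.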
pycruft