I was playing around with list comprehensions to get a better understanding of them and I ran into some unexpected output that I am not able to explain. I haven't found this question asked before, but if it /is/ a repeat question, I apologize.

I was essentially trying to write a generator which generated generators. A simple generator expression, which uses the same syntax as a list comprehension, would look like this:

(x for x in range(10) if x%2==0) # generates all even integers in range(10)

What I was trying to do was write a generator that generated two generators - the first of which generated the even numbers in range(10) and the second of which generated the odd numbers in range(10). For this, I did:

>>> (x for x in range(10) if x%2==i for i in range(2))
<generator object <genexpr> at 0x7f6b90948f00>

>>> for i in g.next(): print i
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
UnboundLocalError: local variable 'i' referenced before assignment
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> g = (x for x in range(10) if x%2==i for i in range(2))
>>> g
<generator object <genexpr> at 0x7f6b90969730>
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
UnboundLocalError: local variable 'i' referenced before assignment

I don't understand why 'i' is being referenced before assignment.

I thought it might have had something to do with i in range(2), so I did:

>>> g = (x for x in range(10) if x%2==i for i in [0.1])
>>> g
<generator object <genexpr> at 0x7f6b90948f00>
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
UnboundLocalError: local variable 'i' referenced before assignment

This didn't make sense to me, so I thought it best to try something simpler first. So I went back to lists and tried:

>>> [x for x in range(10) if x%2==i for i in range(2)]
[1, 1, 3, 3, 5, 5, 7, 7, 9, 9]

which I expected to be the same as:

>>> l = []
>>> for i in range(2):
...     for x in range(10):
...             if x%2==i:
...                     l.append(x)
... 
>>> l
[0, 2, 4, 6, 8, 1, 3, 5, 7, 9] # so where is my list comprehension malformed?

But when I tried it on a hunch, this worked:

>>> [[x for x in range(10) if x%2==i] for i in range(2)]
[[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]] # so nested lists in nested list comprehension somehow affect the scope of if statements? :S

So I thought it might be a problem with what level of scope the if statement operates in. So I tried this:

>>> [x for x in range(10) for i in range(2) if x%2==i]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

And now I'm thoroughly confused. Can someone please explain this behavior? I don't understand why my list comprehensions seem to be malformed, nor do I understand how the scoping of the if clauses works.

Any help would be greatly appreciated

Thank you

PS: While proof-reading the question, I realized that this does look a bit like a homework question - it is not.

+2  A: 

You need to use some parentheses:

((x for x in range(10) if x%2==i) for i in range(2))

This didn't make sense to me, so I thought it best to try something simpler first. So I went back to lists and tried:

>>> [x for x in range(10) if x%2==i for i in range(2)]
[1, 1, 3, 3, 5, 5, 7, 7, 9, 9]

That worked because, in Python 2, an earlier list comprehension leaked its i variable into the enclosing scope, and that leftover i (with its final value, 1) is what the if clause picked up. Start a fresh Python interpreter and the same expression fails with a NameError. List comprehensions no longer leak their loop variable in Python 3.
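A minimal sketch of the fresh-scope failure, assuming Python 3, where a comprehension's loop variable never leaks. The function name flat_comprehension and the variable outcome are made up for illustration:

```python
def flat_comprehension():
    # `i` is read by the `if` clause before the later `for i` clause binds it.
    return [x for x in range(10) if x % 2 == i for i in range(2)]

try:
    flat_comprehension()
    outcome = "no error"
except NameError as exc:
    # UnboundLocalError is a subclass of NameError, so either one lands here.
    outcome = type(exc).__name__

print(outcome)
```

Because nothing can leak in from an earlier comprehension, this should fail every time rather than only in a fresh interpreter.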

EDIT:

The equivalent for loop for:

(x for x in range(10) if x%2==i for i in range(2))

would be:

l = []
for x in range(10):
    if x%2 == i:
        for i in range(2):
            l.append(x)

which also raises a NameError, because i is read in the if test before the inner loop assigns it.
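The left-to-right nesting rule can be checked directly. This is only a sketch; the names expected, grouped and interleaved are made up for illustration. Putting the for i clause first reproduces the nested loop from the question, while putting it after for x gives the 0 to 9 result the asker saw:

```python
# The nested loop the asker expected the comprehension to match.
expected = []
for i in range(2):
    for x in range(10):
        if x % 2 == i:
            expected.append(x)

# Clauses nest left to right: `for i` is the outer loop here.
grouped = [x for i in range(2) for x in range(10) if x % 2 == i]

# Here `for x` is the outer loop, so each x is emitted once, in order.
interleaved = [x for x in range(10) for i in range(2) if x % 2 == i]

print(grouped)      # [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]
print(interleaved)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```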

EDIT2:

The parenthesized version:

((x for x in range(10) if x%2==i) for i in range(2))

is equivalent to (building lists instead of generators, for illustration):

li = []
for i in range(2):
    lx = []
    for x in range(10):
        if x%2==i:
            lx.append(x)
    li.append(lx)
Lie Ryan
Thank you, I understand the leak error, but why are the parentheses required? What does the code without parentheses translate into (in terms of a for loop)? I understand that parentheses fix the problem, I just don't get why
inspectorG4dget
@inspectorG4dget: see my updated answer
Lie Ryan
@LieRyan: this makes things so much clearer now, thank you much.
inspectorG4dget
Unfortunately, depending on just how you consume the result, the first generator can give even numbers, odd numbers, or some even followed by some odd.
Duncan
+1  A: 

Lie has the answer to the syntactical question. A suggestion: don't stuff so much into the body of a generator. A function is much more readable.

def make_generator(modulus):
    return (x for x in range(10) if x % 2 == modulus)
g = (make_generator(i) for i in range(2))
Glenn Maynard
Thank you. I understand this and I do prefer this. But I was trying to get some practice with list comprehensions and see how far I could push them
inspectorG4dget
+1  A: 

Expanding on Lie Ryan's answer a bit:

something = (x for x in range(10) if x%2==i for i in range(2))

is equivalent to:

def _gen1():
   for x in range(10):
       if x%2 == i:
           for i in range(2):
               yield x
something = _gen1()

whereas the parenthesised version is equivalent to:

def _gen1():
   def _gen2():
      for x in range(10):
          if x%2 == i:   # i is looked up in _gen1's scope when _gen2 runs
              yield x

   for i in range(2):
      yield _gen2()

This does actually yield the two generators:

[<generator object <genexpr> at 0x02A0A968>, <generator object <genexpr> at 0x02A0A990>]

Unfortunately the generators it yields are somewhat unstable: each inner generator looks up i only when it is consumed, not when it is created, so the output will depend on how you consume them:

>>> gens = ((x for x in range(10) if x%2==i) for i in range(2))
>>> for g in gens:
        print(list(g))


[0, 2, 4, 6, 8]
[1, 3, 5, 7, 9]
>>> gens = ((x for x in range(10) if x%2==i) for i in range(2))
>>> for g in list(gens):
        print(list(g))


[1, 3, 5, 7, 9]
[1, 3, 5, 7, 9]

My advice is to write the generator functions out in full: I think trying to get the correct scoping on i without doing that may be all but impossible.
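One possible way to write them out in full is sketched below. The names parity_numbers, parity_generators and the limit parameter are made up for illustration. Passing i as an argument gives each inner generator its own copy, so the result should not depend on how the outer generator is consumed:

```python
def parity_numbers(remainder, limit=10):
    """Yield the numbers below `limit` whose remainder modulo 2 is `remainder`."""
    for x in range(limit):
        if x % 2 == remainder:
            yield x

def parity_generators():
    """Yield one generator per remainder: evens first, then odds."""
    for i in range(2):
        # `i` is bound as an argument at call time, not looked up later.
        yield parity_numbers(i)

# Consume each inner generator as soon as it is produced.
lazy = [list(g) for g in parity_generators()]

# Exhaust the outer generator first, then consume the inner ones.
eager = [list(g) for g in list(parity_generators())]

print(lazy)
print(eager)
```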

Duncan