How can an iterator over a non-empty sequence, with no filtering and no aggregation (sum(), etc.), yield nothing?

Consider a simple example:

sequence = ['a', 'b', 'c']
list((el, ord(el)) for el in sequence)

This yields [('a', 97), ('b', 98), ('c', 99)] as expected.

Now, just swap the ord(el) out for an expression that takes the first value out of some generator using (...).next() — forgive the contrived example:

def odd_integers_up_to_length(str):
    return (x for x in xrange(len(str)) if x%2==1)

list((el, odd_integers_up_to_length(el).next()) for el in sequence)

This yields []. Yeah, empty list. No ('a', stuff) tuples. Nothing.

But we're not filtering or aggregating or reducing. A generator expression over n objects without filtering or aggregation must yield n objects, right? What's going on?

+12  A: 

odd_integers_up_to_length(el).next() raises StopIteration. Nothing catches it at the call site, so it propagates out of the enclosing generator expression, where list() takes it as the normal end-of-iteration signal and stops before anything has been yielded.

Look at the first iteration, when the value is 'a':

>>> odd_integers_up_to_length('a').next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
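
To see why the very first call fails, it may help to list everything the inner generator produces. This is only a sketch: the function is the one from the question, rewritten with range() and a parameter named s (both my changes) so that it runs on Python 2 and Python 3 alike.

```python
def odd_integers_up_to_length(s):
    # Same logic as in the question: odd integers below len(s).
    return (x for x in range(len(s)) if x % 2 == 1)

# A one-character string has length 1, so the only candidate is 0, which is even.
short = list(odd_integers_up_to_length('a'))
# A four-character string gives the candidates 0..3, of which 1 and 3 are odd.
longer = list(odd_integers_up_to_length('abcd'))

print(short)   # []
print(longer)  # [1, 3]
```

Every element of the question's sequence is a single character, so the inner generator is empty each time and taking its first value has nothing to return.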
Devin Jeanpierre
Oops, I should have stated that I was just posting an observation in the form of a question — sorry I took your time needlessly!
Gunnlaugur Briem
+2  A: 

What happens is that the next() call raises a StopIteration exception, which bubbles up the stack to the outer generator expression and stops that iteration.

A StopIteration is the normal way for an iterator to signal that it's done. Generally we don't see it, because generally the next() call occurs within a construct that consumes the iterator, e.g. for x in iterator or sum(iterator). But when we call next() directly, we are the ones responsible for catching the StopIteration. Not doing so springs a leak in the abstraction, which here leads to unexpected behavior in the outer iteration.
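
One way to take that responsibility is to catch the exception right where next() is called. The helper below is hypothetical (first_or_none is a name made up for this sketch, as is the rows data), and it uses the built-in next() function, which exists on Python 2.6+ and Python 3.

```python
def first_or_none(iterable):
    # Hypothetical helper: return the first item, or None if there is none.
    iterator = iter(iterable)
    try:
        return next(iterator)
    except StopIteration:
        # Handle exhaustion here so it cannot leak into an outer iteration.
        return None

rows = [[1, 2], [], [3]]
firsts = list((row, first_or_none(row)) for row in rows)
print(firsts)  # [([1, 2], 1), ([], None), ([3], 3)]
```

Because the StopIteration never leaves the helper, the outer generator expression yields one tuple per row, including the empty one.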

The lesson, I suppose: be careful about direct calls to next().
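
A minimal sketch of a more careful call, assuming a placeholder value is acceptable when the inner generator is empty: the built-in next() accepts a second argument that is returned instead of raising StopIteration. The choice of None as the default, and the use of range() and the parameter name s, are my assumptions and not part of the question.

```python
def odd_integers_up_to_length(s):
    # Same logic as in the question: odd integers below len(s).
    return (x for x in range(len(s)) if x % 2 == 1)

sequence = ['a', 'b', 'c']

# next(iterator, default) returns the default when the iterator is exhausted,
# so no StopIteration can escape into the outer generator expression.
pairs = list((el, next(odd_integers_up_to_length(el), None)) for el in sequence)
print(pairs)  # [('a', None), ('b', None), ('c', None)]
```

The silent empty list is specific to older interpreters: on Python 3.7 and later, a StopIteration that escapes a generator is reported as a RuntimeError (PEP 479) rather than quietly ending the iteration.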

Gunnlaugur Briem
A: 

str is a reserved keyword; you should name your variable differently.

I was also going to advise about the next() call.

Martin
True :), this was just thrown together for the example here.
Gunnlaugur Briem
Terminology nitpick: not a reserved keyword (or it wouldn't compile) - it's a builtin, which it is good practice not to shadow.
Brian
A: 
>>> seq=['a','b','c']
>>> list((el,4) for el in seq)
[('a', 4), ('b', 4), ('c', 4)]

So it's not list giving you trouble here...

dwc