I have a generator object returned by a function with multiple yield statements. Preparing to call this generator is a rather time-consuming operation, which is why I want to reuse the generator several times.

y = FunctionWithYield()
for x in y: print(x)
#here must be something to reset 'y'
for x in y: print(x)

Of course, I'm keeping in mind the option of copying the content into a simple list.

+1  A: 

If GrzegorzOledzki's answer won't suffice, you could probably use send() to accomplish your goal. See PEP-0342 for more details on enhanced generators and yield expressions.

UPDATE: Also see itertools.tee(). It involves some of that memory vs. processing tradeoff mentioned above, but it might save some memory over just storing the generator results in a list; it depends on how you're using the generator.
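
As a rough illustration of the send() idea, here is a minimal sketch in which the generator itself listens for a reset command. The function name `resettable` and the `"reset"` command string are made up for this example, and the approach only works while the generator has not yet finished:

```python
def resettable(n):
    """Hypothetical generator that restarts its count when sent "reset"."""
    i = 0
    while i < n:
        # A plain next() makes `command` None; send(value) makes it `value`.
        command = yield i
        if command == "reset":
            i = 0
        else:
            i += 1


g = resettable(3)
a = next(g)            # first value
b = next(g)            # second value
c = g.send("reset")    # restarts the count and returns the first value again
rest = list(g)         # remaining values after the reset
```

Note that send() does not fit naturally into a plain for loop, so this is mostly useful when you drive the generator by hand.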

Hank Gay
+6  A: 

Generators can't be rewound. You have the following options:

  1. Run the generator function again, restarting the generation:

    y = FunctionWithYield()
    for x in y: print(x)
    y = FunctionWithYield()
    for x in y: print(x)
    
  2. Store the generator results in a data structure in memory or on disk, which you can iterate over again:

    y = list(FunctionWithYield())
    for x in y: print(x)
    # can iterate again:
    for x in y: print(x)
    

The downside of option 1 is that it computes the values again. If that's CPU-intensive, you end up calculating everything twice. On the other hand, the downside of option 2 is the storage: the entire list of values will be held in memory. If there are too many values, that can be impractical.

So you have the classic memory vs. processing tradeoff. I can't imagine a way of rewinding the generator without either storing the values or calculating them again.

nosklo
Maybe there is a way to save the signature of the function call? FunctionWithYield, param1, param2...
Dewfy
@Dewfy: sure: def call_my_func(): return FunctionWithYield(param1, param2)
nosklo
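
One way to sketch the "save the call" idea from the comments above is a small wrapper that remembers the generator function and its arguments and calls it afresh on every iteration. The class name `Reiterable` and the stand-in `function_with_yield` are assumptions made for this example, not anything from the original code:

```python
class Reiterable:
    """Hypothetical wrapper: remembers a generator function and its arguments."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __iter__(self):
        # Each for loop gets a brand-new generator, so the values are recomputed.
        return self.func(*self.args, **self.kwargs)


def function_with_yield(start, stop):
    # Stand-in for the real FunctionWithYield(param1, param2).
    for i in range(start, stop):
        yield i


y = Reiterable(function_with_yield, 1, 4)
first = [x for x in y]
second = [x for x in y]   # iterating again works, no explicit reset needed
```

This is still option 1: nothing is cached, so any work done inside the generator function is repeated on each pass.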
+1  A: 

Probably the simplest solution is to wrap the expensive part in an object and pass that to the generator:

data = ExpensiveSetup()
for x in FunctionWithYield(data): pass
for x in FunctionWithYield(data): pass

This way, you can cache the expensive calculations.

If you can keep all results in RAM at the same time, then use list() to materialize the results of the generator in a plain list and work with that.
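
A minimal runnable sketch of this pattern follows, assuming the expensive part can be separated from the iteration. The names `expensive_setup` and `function_with_yield` and the squared-numbers data are placeholders invented for the example:

```python
setup_calls = 0  # counts how often the expensive step actually runs


def expensive_setup():
    # Placeholder for the real time-consuming preparation.
    global setup_calls
    setup_calls += 1
    return [n * n for n in range(5)]


def function_with_yield(data):
    # Cheap generator that only walks over the prepared data.
    for value in data:
        yield value


data = expensive_setup()                   # done once
first = list(function_with_yield(data))    # first pass
second = list(function_with_yield(data))   # second pass reuses the same data
```

Creating a new generator is cheap here because the preparation is not part of it.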

Aaron Digulla
+6  A: 

Another option is to use the itertools.tee() function to create a second version of your generator:

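from itertools import tee

y = FunctionWithYield()
# tee() returns two independent iterators over the same underlying generator
y, y_backup = tee(y)
for x in y:
    print(x)
for x in y_backup:
    print(x)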

This could be beneficial from a memory usage point of view if the original iteration might not process all the items.
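
To make the partial-consumption case concrete, here is a small sketch that records which items the source generator actually produced. The `produced` list and the `function_with_yield` stand-in are assumptions added for illustration:

```python
from itertools import tee

produced = []  # records every item the source generator had to compute


def function_with_yield(n):
    # Stand-in generator that logs each item it computes.
    for i in range(n):
        produced.append(i)
        yield i


y, y_backup = tee(function_with_yield(1000))

first_pass = []
for x in y:
    if x >= 3:        # stop early instead of consuming everything
        break
    first_pass.append(x)

second_pass = []
for x in y_backup:    # replays the buffered items, pulling new ones only if needed
    if x >= 3:
        break
    second_pass.append(x)
```

Both loops stop early, so only the first few items are ever computed and buffered, whereas `list(function_with_yield(1000))` would compute all 1000 up front.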

Ants Aasma
If you're wondering what it will do in this case, it's essentially caching the elements in a list. So you might as well use `y = list(y)` with the rest of your code unchanged.
ilya n.
tee() will create a list internally to store the data, so that's the same as I did in my answer.
nosklo
Look at the implementation (http://docs.python.org/library/itertools.html#itertools.tee): it uses a lazy loading strategy, so items are copied into the internal buffer only on demand.
Dewfy
@Dewfy: Which will be **slower** since all items will have to be copied anyway.
nosklo
yes, list() is better in this case. tee is only useful if you are not consuming the entire list
Brandon Thomson
A: 

I'm not sure what you meant by expensive preparation, but I guess you actually have

data = ... # Expensive computation
y = FunctionWithYield(data)
for x in y: print(x)
#here must be something to reset 'y'
# this is expensive - data = ... # Expensive computation
# y = FunctionWithYield(data)
for x in y: print(x)

If that's the case, why not reuse data?

ilya n.