Given:

x = ['a','b','c','d','e']
y = ['1','2','3']

I'd like to iterate over both, resulting in:

a, 1
b, 2
c, 3
d, 1
e, 2
a, 3
b, 1

... where the two iterables cycle independently until a given count.

Python's itertools.cycle(iterable) can do this with one iterable. Functions such as map and itertools.izip_longest can pad the shorter iterable with None (or a fill value), but do not provide built-in auto-repeat.

A not-so-crafty idea is to just concatenate each list to a certain size from which I can iterate evenly. (Boooo!)

Suggestions? Thanks in advance.

+10  A: 

The simplest way to do this is in cyclezip1 below. It is fast enough for most purposes.

import itertools

def cyclezip1(it1, it2, count):
    # pair up two endless cycles, then cut the result off after count pairs
    pairs = itertools.izip(itertools.cycle(it1),
                           itertools.cycle(it2))
    return itertools.islice(pairs, 0, count)
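
For readers on Python 3, where itertools.izip no longer exists and the built-in zip is already lazy, the same idea might look like the sketch below. The name cyclezip is only illustrative, and the example reuses the x and y from the question.

```python
import itertools

def cyclezip(it1, it2, count):
    """Return an iterator over `count` pairs, cycling each iterable independently."""
    # zip is lazy in Python 3, so pairing two infinite cycles is safe
    pairs = zip(itertools.cycle(it1), itertools.cycle(it2))
    # stop after `count` pairs
    return itertools.islice(pairs, count)

x = ['a', 'b', 'c', 'd', 'e']
y = ['1', '2', '3']
for a, b in cyclezip(x, y, 7):
    print(a + ', ' + b)
```

With count set to 7 this should print the seven lines listed in the question, from "a, 1" through "b, 1".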

Here is another implementation that is about twice as fast when count is significantly larger than the least common multiple of the lengths of co1 and co2.

import fractions

def cyclezip2(co1, co2, count):
    l1 = len(co1)
    l2 = len(co2)
    # integer division: islice needs an integer stop value, not a float
    lcm = l1 * l2 // fractions.gcd(l1, l2)
    pairs = itertools.izip(itertools.cycle(co1),
                           itertools.cycle(co2))
    pairs = itertools.islice(pairs, 0, lcm)
    pairs = itertools.cycle(pairs)
    return itertools.islice(pairs, 0, count)

Here we take advantage of the fact that pairs will repeat after the first n of them, where n is the least common multiple of len(co1) and len(co2). This of course assumes that the arguments are collections, so that asking for their length makes sense. A further optimization that can be made is to replace the line

pairs = itertools.islice(pairs, 0, lcm)

with

pairs = list(itertools.islice(pairs, 0, lcm))

This is not nearly as dramatic an improvement (about 2% in my testing) and not nearly as consistent. It also requires more memory. If co1 and co2 were known in advance to be small enough that the additional memory was negligible, then you could squeeze that extra performance out of it.
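
As a rough illustration of the period argument, here is a Python 3 sketch of the list-based variant. It uses math.gcd, since fractions.gcd was removed in newer Python versions. The name cyclezip_lcm is made up for this example, and the code assumes both arguments are non-empty collections that support len().

```python
import itertools
import math

def cyclezip_lcm(co1, co2, count):
    """Cycle over one precomputed period of pairs.

    Assumes co1 and co2 are non-empty sized collections.
    """
    l1, l2 = len(co1), len(co2)
    # the sequence of pairs repeats every lcm(l1, l2) items
    lcm = l1 * l2 // math.gcd(l1, l2)
    pairs = zip(itertools.cycle(co1), itertools.cycle(co2))
    # materialize exactly one period, then cycle over that list
    period = list(itertools.islice(pairs, lcm))
    return itertools.islice(itertools.cycle(period), count)

x = ['a', 'b', 'c', 'd', 'e']
y = ['1', '2', '3']
one_period = list(cyclezip_lcm(x, y, 15))   # lcm(5, 3) == 15
two_periods = list(cyclezip_lcm(x, y, 30))
print(two_periods == one_period + one_period)
```

For these inputs the period is 15 pairs, so asking for 30 pairs should give the first 15 twice over.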

It's interesting to note that the obvious thing to do in the case of a collection, indexing each one modulo its length, is about four times slower than the first option presented.

def cyclezip3(co1, co2, count):
    l1 = len(co1)
    l2 = len(co2)
    return ((co1[i%l1], co2[i%l2]) for i in xrange(count))
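
To reproduce this kind of comparison yourself, something like the following Python 3 sketch could be used. The count and repetition numbers are arbitrary choices for illustration, and the ratio you measure will depend on your interpreter and machine, so it may not match the figures quoted above.

```python
import itertools
import timeit

def cyclezip1(it1, it2, count):
    # Python 3 port of the cycle-based version
    pairs = zip(itertools.cycle(it1), itertools.cycle(it2))
    return itertools.islice(pairs, count)

def cyclezip3(co1, co2, count):
    # Python 3 port of the modulo-indexing version
    l1, l2 = len(co1), len(co2)
    return ((co1[i % l1], co2[i % l2]) for i in range(count))

x = ['a', 'b', 'c', 'd', 'e']
y = ['1', '2', '3']
count = 10000  # arbitrary size chosen for this illustration

def consume(func):
    # exhaust the iterator so the timing covers all of the work
    for _ in func(x, y, count):
        pass

t1 = timeit.timeit(lambda: consume(cyclezip1), number=10)
t3 = timeit.timeit(lambda: consume(cyclezip3), number=10)
print('cyclezip1: %.4fs  cyclezip3: %.4fs' % (t1, t3))
```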
aaronasterling
Thanks for the excellent breakdown!
cwall
+7  A: 
import itertools
x = ['a','b','c','d','e']
y = ['1','2','3']
for a, b in itertools.izip(itertools.cycle(x), itertools.cycle(y)):
    print a, b
Glenn Maynard
Simplicity wins!
cwall