views:

753

answers:

17

I've written a bit of code like the following to compare items with other items further on in a list. Is there a more elegant pattern for this sort of dual iteration?

jump_item_iter = (j for j in items if some_cond)
try:
    jump_item = jump_item_iter.next()
except StopIteration:
    return
for item in items:
    if jump_item is item:
        try:
            jump_item = jump_iter.next()
        except StopIteration:
            return
    # do lots of stuff with item and jump_item

I don't think the "except StopIteration" is very elegant

Edit:

To hopefully make it clearer, I want to visit each item in a list and pair it with the next item further on in the list (jump_item) which satisfies some_cond.

A: 

I have no idea what you're trying to do with that code. But I'm 99% certain that whatever it is could probably be done in 2 lines. I also get the feeling that the '==' operator should be an 'is' operator, otherwise what is the compare() function doing? And what happens if the item returned from the second jump_iter.next call also equals 'item'? It seems like the algorithm would do the wrong thing since you'll compare the second and not the first.

Kylotan
Thanks, I've updated the '==' to be an 'is', that is more correct.The compare function is a placeholder for a big block of code that doesn't have much relevance to the question.
EoghanM
A: 

You could put the whole iteration into a single try structure, that way it would be clearer:

jump_item_iter = (j for j in items if some_cond)
try:
    jump_item = jump_item_iter.next()
    for item in items:
        if jump_item is item:
            jump_item = jump_iter.next()

    # do lots of stuff with item and jump_item

 except StopIteration:
     pass
nikow
I suppose I could better put the question: I want to visit each item in a list and pair it with the next item further on in the list which meets a certain condition.I'll update the question
EoghanM
A: 

So you want to compare pairs of items in the same list, the second item of the pair having to meet some condition. Normally, when you want to compare pairs in a list use zip (or itertools.izip):

for item1, item2 in zip(items, items[1:]):
    compare(item1, item2)

Figure out how to fit your some_cond in this :)

Virgil Dupras
I'm not comparing items[i] to items[i+1], so long as `some_cond` is false, item2 can skip ahead multiple places in the list.
EoghanM
That's why I was talking about figuring out how to fit `some_cond`. something like `others = [item for item in items[1:] if some_cond]` and then `zip(items, others)`.
Virgil Dupras
Isn't the part that he should "figure out" the actual hard part? And why is your approach the right one for this?
nikow
Hi Virgil - if I zip(items, others) then each item2 will be paired with only one item1; in my current implementation it is possible to pair multiple item1s with the same item2 (until item1 reaches item2 at which point item2 jumps ahead again)
EoghanM
True enough. My answer is wrong then. Sorry.
Virgil Dupras
+1  A: 

I have no idea what compare() is doing, but 80% of the time, you can do this with a trivial dictionary or pair of dictionaries. Jumping around in a list is a kind of linear search. Linear Search -- to the extent possible -- should always be replaced with either a direct reference (i.e., a dict) or a tree search (using the bisect module).

S.Lott
I've removed the reference to compare() as it was confusing - I'm actually doing lots of stuff with item and jump_item - compare was a placeholder. This is Linear Search as `items` is definitely a sequence.
EoghanM
A: 

Are you basically trying to compare every item in the iterator with every other item in the original list?

To my mind this should just be a case of using two loops, rather than trying to fit it into one.


filtered_items = (j for j in items if some_cond)
for filtered in filtered_items:
    for item in items:
        if filtered != item:
            compare(filtered, item)

John Montgomery
I'm trying to avoid nested loops as I only need to visit each 'item' once.
EoghanM
A: 

Here is one simple solution that might look a little cleaner:

for i, item in enumerate(items):
    for next_item in items[i+1:]:
        if some_cond(next_item):
            break
    # do some stuff with both items

The disadvantage is that you check the condition for next_item multiple times. But you can easily optimize this:

cond_items = [item if some_cond(item) else None for item in items]
for i, item in enumerate(items):
    for next_item in cond_items[i+1:]:
        if next_item is not None:
            break
    # do some stuff with both items

However, both solutions carry more overhead than the original solution from the question. And when you start using counters to work around this then I think it is better to use the iterator interface directly (as in the original solution).

nikow
+1  A: 

How about this?

paired_values = []
for elmt in reversed(items):
    if <condition>:
        current_val = elmt
    try:
        paired_values.append(current_val)
    except NameError:  # for the last elements of items that don't pass the condition
        pass
paired_values.reverse()

for (item, jump_item) in zip(items, paired_values):  # zip() truncates to len(paired_values)
    # do lots of stuff

If the first element of items matches, then it is used as a jump_item. This is the only difference with your original code (and you might want this behavior).

EOL
A: 

My first answer was wrong because I didn't quite understand what you were trying to achieve. So if I understand correctly (this time, I hope), you want the main for item in items: to "chase" after an iterator that filters out some items. Well, there's not much you can do, except maybe wrap this into a chase_iterator(iterable, some_cond) generator, which would make your main code a little more readable.

Maybe that a more readable approach would be an "accumulator approach" (if the order of the compare() don't matter), like:

others = []
for item in items:
    if some_cond(item):
        for other in others:
            compare(item, other)
        others = []
    else:
        others.append(item)

(man, I'm beginning to hate Stack Overflow... too addictive...)

Virgil Dupras
+1  A: 

The following iterator is time and memory-efficient:

def jump_items(items):
    number_to_be_returned = 0
    for elmt in items:
        if <condition(elmt)>:
            for i in range(number_to_be_returned):
                yield elmt
            number_to_be_returned = 1
        else:
            number_to_be_returned += 1

for (item, jump_item) in zip(items, jump_items(items)):
    # do lots of stuff

Note that you may actually want to set the first number_to_be_returned to 1...

EOL
A: 
for i in range( 0, len( items ) ):
    for j in range( i+1, len( items ) ):
        if some_cond:
            #do something
            #items[i] = item, items[j] = jump_item
spbogie
A: 

With just iterators

def(lst, some_cond):
      jump_item_iter = (j for j in lst if som_cond(j))
      pairs = itertools.izip(lst, lst[1:])
      for last in jump_item_iter:
        for start, start_next in itertools.takewhile(lambda pair: pair[0] < last, pairs):
          yield start, last
        pairs = itertools.chain([(start_next, 'dummy')], pairs)

with the input: range(10) and some_cond = lambda x : x % 2 gives [(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)] (same that your example)

odwl
Here, even the 'batch' (see Ants Aasma solution) is an iterable.
odwl
A: 

Even better using itertools.groupby:

def h(lst, cond):
  remain = lst
  for last in (l for l in lst if cond(l)):
    group = itertools.groupby(remain, key=lambda x: x < last)
    for start in group.next()[1]:
      yield start, last
    remain = list(group.next()[1])

Usage: lst = range(10) cond = lambda x: x%2 print list(h(lst, cond))

will print

[(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)]
odwl
A: 

Maybe it is too late, but what about:

l = [j for j in items if some_cond]
for item, jump_item in zip(l, l[1:]):
    # do lots of stuff with item and jump_item

If l = [j for j in range(10) if j%2 ==0] then the iteration is over: [(0, 2),(2, 4),(4, 6),(6, 8)].

rz
it never is too late, but it's not what OP wants
SilentGhost
With items = range(10) and some_cond=lambda x %2,This will give (1 3), (3 5), (5,7), (7,9)I think that it is not the question you are missing 0,1 for instance
odwl
how so? the first line gets all the items that satisfy some_cond and the loop is over pairs of such items. ie each element that satisfies cond is paired with the next element that satisfies cond.
rz
+1  A: 

Write a generator function:

def myIterator(someValue):
    yield (someValue[0], someValue[1])

for element1, element2 in myIterator(array):
     # do something with those elements.
Georg
+2  A: 

As far as I can see any of the existing solutions work on a general one shot, possiboly infinite iter**ator**, all of them seem to require an iter**able**.

Heres a solution to that.

def batch_by(condition, seq):
    it = iter(seq)
    batch = [it.next()]
    for jump_item in it:
        if condition(jump_item):
            for item in batch:
                yield item, jump_item
            batch = []
        batch.append(jump_item)

This will easily work on infinite iterators:

from itertools import count, islice
is_prime = lambda n: n == 2 or all(n % div for div in xrange(2,n))
print list(islice(batch_by(is_prime, count()), 100))

This will print first 100 integers with the prime number that follows them.

Ants Aasma
Looks like a very good solution. But the batch is not an iterator.
odwl
A: 

You could write your loop body as:

import itertools, functools, operator

for item in items:
    jump_item_iter = itertools.dropwhile(functools.partial(operator.is_, item), 
                                         jump_item_iter)

    # do something with item and jump_item_iter

dropwhile will return an iterator that skips over all those which match the condition (here "is item").

Brian
Are you missing the line `jump_item_iter = items` before the loop body?
EoghanM
It should be the same as in the question before the loop body. ie jump_item_iter = "(j for j in items if some_cond)"
Brian
I think this doesn't work as there is no way to know when to call jump_item_iter.next() and when to reuse the previous value of jump_item_iter.next()
EoghanM
jump_item_iter will be replaced with an iterator which will wrap the previous one - calling next() should have the same effect as having applied the check in your loop, return the next item, skipping over an identical one. However, I might be misinterpreting your code - I've just reread it an noticed that you're using jump_iter in the loop body, which I read as jump_item_iter. If this is indeed something different, then I think I've misunderstood the question.
Brian
A: 

You could do something like:

import itertools

def matcher(iterable, compare):
    iterator= iter(iterable)
    while True:
        try: item= iterator.next()
        except StopIteration: break
        iterator, iterator2= itertools.tee(iterator)
        for item2 in iterator2:
            if compare(item, item2):
                yield item, item2

but it's quite elaborate (and actually not very efficient), and it would be simpler if you just did a

items= list(iterable)

and then just write two loops over items.

Obviously, this won't work with infinite iterables, but your specification can only work on finite iterables.

ΤΖΩΤΖΙΟΥ