ansaurus

Question

Python: Elegant way of dual/multiple iteration over the same list

Answer 1

A:

I have no idea what you're trying to do with that code. But I'm 99% certain that whatever it is could probably be done in 2 lines. I also get the feeling that the '==' operator should be an 'is' operator, otherwise what is the compare() function doing? And what happens if the item returned from the second jump_iter.next call also equals 'item'? It seems like the algorithm would do the wrong thing since you'll compare the second and not the first.

Kylotan 2009-05-15 10:44:14

Thanks, I've updated the '==' to be an 'is', that is more correct.The compare function is a placeholder for a big block of code that doesn't have much relevance to the question.

EoghanM 2009-05-15 11:21:19

Answer 2

A:

You could put the whole iteration into a single try structure, that way it would be clearer:

jump_item_iter = (j for j in items if some_cond)
try:
    jump_item = jump_item_iter.next()
    for item in items:
        if jump_item is item:
            jump_item = jump_iter.next()

    # do lots of stuff with item and jump_item

 except StopIteration:
     pass

nikow 2009-05-15 10:46:34

I suppose I could better put the question: I want to visit each item in a list and pair it with the next item further on in the list which meets a certain condition.I'll update the question

EoghanM 2009-05-15 11:57:04

Answer 3

A:

So you want to compare pairs of items in the same list, the second item of the pair having to meet some condition. Normally, when you want to compare pairs in a list use zip (or itertools.izip):

for item1, item2 in zip(items, items[1:]):
    compare(item1, item2)

Figure out how to fit your some_cond in this :)

Virgil Dupras 2009-05-15 10:46:37

I'm not comparing items[i] to items[i+1], so long as `some_cond` is false, item2 can skip ahead multiple places in the list.

EoghanM 2009-05-15 11:23:11

That's why I was talking about figuring out how to fit `some_cond`. something like `others = [item for item in items[1:] if some_cond]` and then `zip(items, others)`.

Virgil Dupras 2009-05-15 11:34:22

Isn't the part that he should "figure out" the actual hard part? And why is your approach the right one for this?

nikow 2009-05-15 11:40:20

Hi Virgil - if I zip(items, others) then each item2 will be paired with only one item1; in my current implementation it is possible to pair multiple item1s with the same item2 (until item1 reaches item2 at which point item2 jumps ahead again)

EoghanM 2009-05-15 11:51:34

True enough. My answer is wrong then. Sorry.

Virgil Dupras 2009-05-15 12:00:02

Answer 4

+1 A:

I have no idea what compare() is doing, but 80% of the time, you can do this with a trivial dictionary or pair of dictionaries. Jumping around in a list is a kind of linear search. Linear Search -- to the extent possible -- should always be replaced with either a direct reference (i.e., a dict) or a tree search (using the bisect module).

S.Lott 2009-05-15 10:47:45

I've removed the reference to compare() as it was confusing - I'm actually doing lots of stuff with item and jump_item - compare was a placeholder. This is Linear Search as `items` is definitely a sequence.

EoghanM 2009-05-15 11:29:37

Answer 5

A:

Are you basically trying to compare every item in the iterator with every other item in the original list?

To my mind this should just be a case of using two loops, rather than trying to fit it into one.


filtered_items = (j for j in items if some_cond)
for filtered in filtered_items:
    for item in items:
        if filtered != item:
            compare(filtered, item)

John Montgomery 2009-05-15 11:31:25

I'm trying to avoid nested loops as I only need to visit each 'item' once.

EoghanM 2009-05-15 11:55:21

Answer 6

A:

Here is one simple solution that might look a little cleaner:

for i, item in enumerate(items):
    for next_item in items[i+1:]:
        if some_cond(next_item):
            break
    # do some stuff with both items

The disadvantage is that you check the condition for next_item multiple times. But you can easily optimize this:

cond_items = [item if some_cond(item) else None for item in items]
for i, item in enumerate(items):
    for next_item in cond_items[i+1:]:
        if next_item is not None:
            break
    # do some stuff with both items

However, both solutions carry more overhead than the original solution from the question. And when you start using counters to work around this then I think it is better to use the iterator interface directly (as in the original solution).

nikow 2009-05-15 12:19:15

Answer 7

+1 A:

How about this?

paired_values = []
for elmt in reversed(items):
    if <condition>:
        current_val = elmt
    try:
        paired_values.append(current_val)
    except NameError:  # for the last elements of items that don't pass the condition
        pass
paired_values.reverse()

for (item, jump_item) in zip(items, paired_values):  # zip() truncates to len(paired_values)
    # do lots of stuff

If the first element of items matches, then it is used as a jump_item. This is the only difference with your original code (and you might want this behavior).

EOL 2009-05-15 12:26:24

Answer 8

A:

My first answer was wrong because I didn't quite understand what you were trying to achieve. So if I understand correctly (this time, I hope), you want the main for item in items: to "chase" after an iterator that filters out some items. Well, there's not much you can do, except maybe wrap this into a chase_iterator(iterable, some_cond) generator, which would make your main code a little more readable.

Maybe that a more readable approach would be an "accumulator approach" (if the order of the compare() don't matter), like:

others = []
for item in items:
    if some_cond(item):
        for other in others:
            compare(item, other)
        others = []
    else:
        others.append(item)

(man, I'm beginning to hate Stack Overflow... too addictive...)

Virgil Dupras 2009-05-15 12:29:29

Answer 9

+1 A:

The following iterator is time and memory-efficient:

def jump_items(items):
    number_to_be_returned = 0
    for elmt in items:
        if <condition(elmt)>:
            for i in range(number_to_be_returned):
                yield elmt
            number_to_be_returned = 1
        else:
            number_to_be_returned += 1

for (item, jump_item) in zip(items, jump_items(items)):
    # do lots of stuff

Note that you may actually want to set the first number_to_be_returned to 1...

EOL 2009-05-15 12:40:49

Answer 10

A:

for i in range( 0, len( items ) ):
    for j in range( i+1, len( items ) ):
        if some_cond:
            #do something
            #items[i] = item, items[j] = jump_item

spbogie 2009-05-15 12:49:50

Answer 11

A:

With just iterators

def(lst, some_cond):
      jump_item_iter = (j for j in lst if som_cond(j))
      pairs = itertools.izip(lst, lst[1:])
      for last in jump_item_iter:
        for start, start_next in itertools.takewhile(lambda pair: pair[0] < last, pairs):
          yield start, last
        pairs = itertools.chain([(start_next, 'dummy')], pairs)

with the input: range(10) and some_cond = lambda x : x % 2 gives [(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)] (same that your example)

odwl 2009-05-15 13:19:04

Here, even the 'batch' (see Ants Aasma solution) is an iterable.

odwl 2009-05-15 16:42:30

Answer 12

A:

Even better using itertools.groupby:

def h(lst, cond):
  remain = lst
  for last in (l for l in lst if cond(l)):
    group = itertools.groupby(remain, key=lambda x: x < last)
    for start in group.next()[1]:
      yield start, last
    remain = list(group.next()[1])

Usage: lst = range(10) cond = lambda x: x%2 print list(h(lst, cond))

will print

[(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)]

odwl 2009-05-15 13:46:05

Answer 13

A:

Maybe it is too late, but what about:

l = [j for j in items if some_cond]
for item, jump_item in zip(l, l[1:]):
    # do lots of stuff with item and jump_item

If l = [j for j in range(10) if j%2 ==0] then the iteration is over: [(0, 2),(2, 4),(4, 6),(6, 8)].

rz 2009-05-15 13:47:15

it never is too late, but it's not what OP wants

SilentGhost 2009-05-15 13:48:57

With items = range(10) and some_cond=lambda x %2,This will give (1 3), (3 5), (5,7), (7,9)I think that it is not the question you are missing 0,1 for instance

odwl 2009-05-15 13:54:31

how so? the first line gets all the items that satisfy some_cond and the loop is over pairs of such items. ie each element that satisfies cond is paired with the next element that satisfies cond.

rz 2009-05-15 13:54:32

Answer 14

+1 A:

Write a generator function:

def myIterator(someValue):
    yield (someValue[0], someValue[1])

for element1, element2 in myIterator(array):
     # do something with those elements.

Georg 2009-05-15 14:03:58

Answer 15

+2 A:

As far as I can see any of the existing solutions work on a general one shot, possiboly infinite iter**ator**, all of them seem to require an iter**able**.

Heres a solution to that.

def batch_by(condition, seq):
    it = iter(seq)
    batch = [it.next()]
    for jump_item in it:
        if condition(jump_item):
            for item in batch:
                yield item, jump_item
            batch = []
        batch.append(jump_item)

This will easily work on infinite iterators:

from itertools import count, islice
is_prime = lambda n: n == 2 or all(n % div for div in xrange(2,n))
print list(islice(batch_by(is_prime, count()), 100))

This will print first 100 integers with the prime number that follows them.

Ants Aasma 2009-05-15 15:49:57

Looks like a very good solution. But the batch is not an iterator.

odwl 2009-05-15 16:43:25

Answer 16

A:

You could write your loop body as:

import itertools, functools, operator

for item in items:
    jump_item_iter = itertools.dropwhile(functools.partial(operator.is_, item), 
                                         jump_item_iter)

    # do something with item and jump_item_iter

dropwhile will return an iterator that skips over all those which match the condition (here "is item").

Brian 2009-05-15 17:48:53

Are you missing the line `jump_item_iter = items` before the loop body?

EoghanM 2009-05-20 16:28:09

It should be the same as in the question before the loop body. ie jump_item_iter = "(j for j in items if some_cond)"

Brian 2009-05-20 18:46:52

I think this doesn't work as there is no way to know when to call jump_item_iter.next() and when to reuse the previous value of jump_item_iter.next()

EoghanM 2009-05-21 19:19:13

jump_item_iter will be replaced with an iterator which will wrap the previous one - calling next() should have the same effect as having applied the check in your loop, return the next item, skipping over an identical one. However, I might be misinterpreting your code - I've just reread it an noticed that you're using jump_iter in the loop body, which I read as jump_item_iter. If this is indeed something different, then I think I've misunderstood the question.

Brian 2009-05-22 09:22:22

Answer 17

A:

You could do something like:

import itertools

def matcher(iterable, compare):
    iterator= iter(iterable)
    while True:
        try: item= iterator.next()
        except StopIteration: break
        iterator, iterator2= itertools.tee(iterator)
        for item2 in iterator2:
            if compare(item, item2):
                yield item, item2

but it's quite elaborate (and actually not very efficient), and it would be simpler if you just did a

items= list(iterable)

and then just write two loops over items.

Obviously, this won't work with infinite iterables, but your specification can only work on finite iterables.

ΤΖΩΤΖΙΟΥ 2009-05-15 21:41:22

ansaurus

tags:

views:

answers:

Python: Elegant way of dual/multiple iteration over the same list

related questions