I'd like to know the best (most compact and "pythonic") way to give special treatment to the last element in a for loop. There is a piece of code that should run only between elements, so it must be skipped after the last one. Here is how I currently do it:

for i, data in enumerate(data_list):
    code_that_is_done_for_every_element
    code_that_is_done_for_every_element
    code_that_is_done_for_every_element

    if i != len(data_list) - 1:
        code_that_is_done_between_elements
        code_that_is_done_between_elements
        code_that_is_done_between_elements

Is there any better way?

Note: I don't want to do it with hacks such as using reduce ;)

+12  A: 

Most of the time it is easier (and cheaper) to make the first iteration the special case instead of the last one:

first = True
for data in data_list:
    if first:
        first = False
    else:
        between_items()

    item()

This will work for any iterable, even for those that have no len():

file = open('/path/to/file')
for line in file:
    process_line(line)

    # No way of telling if this is the last line!

Apart from that, I don't think there is a generally superior solution as it depends on what you are trying to do. For example, if you are building a string from a list, it's naturally better to use str.join() than using a for loop “with special case”.
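To illustrate the str.join() remark, here is a minimal sketch in Python 3 syntax. The list `words` and the separator are made up for the example; the point is only that join() places the separator between elements and never after the last one.

```python
# Hypothetical data; any iterable of strings behaves the same way.
words = ["alpha", "beta", "gamma"]

# Loop version: appending the separator is the "between elements" code.
parts = []
for i, word in enumerate(words):
    if i > 0:
        parts.append(", ")
    parts.append(word)
looped = "".join(parts)

# str.join() inserts the separator between elements only.
joined = ", ".join(words)

assert looped == joined == "alpha, beta, gamma"
```

Both versions build the same string, but the join() call needs no special case at all.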


I liked your way to do it, only if it were more compact... ;)

Using the same principle but more compact:

for i, line in enumerate(data_list):
    if i > 0:
        between_items()
    item()

Looks familiar, doesn't it? :)

Ferdinand Beyer
True, this way seems better than mine; at least it doesn't need to use enumerate and len.
e.tadeu
Yes, but it adds another `if` which could be avoided if the loop was split into two loops. However, this is relevant only when iterating a huge data list.
Adam Matan
The problem with splitting into two loops is that it either violates DRY or it forces you to define methods.
e.tadeu
I liked your way to do it, only if it were more compact... ;)
e.tadeu
+6  A: 

If you're just looking to modify the last element in data_list, you can use the notation:

data_list[-1]

However, it looks like you're doing more than that. There is nothing really wrong with your way. I even took a quick glance at some Django code for their template tags and they do basically what you're doing.

Bartek
I'm not modifying it, I'm using it to do something
e.tadeu
+1  A: 

Is there no possibility to iterate over all-but the last element, and treat the last one outside of the loop? After all, a loop is created to do something similar to all elements you loop over; if one element needs something special, it shouldn't be in the loop.

(see also this question: does-the-last-element-in-a-loop-deserve-a-separate-treatment)

EDIT: since the question is more about the "in between", either the first element is the special one in that it has no predecessor, or the last element is special in that it has no successor.

xtofl
But the last element should be treated similar to every other element in the list. The problem is the thing that should be done only *between* elements.
e.tadeu
In that case, the first one is the only one without a predecessor. Take that one out, and run the general code in a loop over the rest of the list.
xtofl
A: 

There is nothing wrong with your way, unless you have 100 000 iterations and want to save 100 000 "if" statements. In that case, you can do it this way:

iterable = [1,2,3] # Your data
iterator = iter(iterable) # get the data iterator

try :   # wrap all in a try / except
    while 1 : 
        item = iterator.next() 
        print item # put the "for loop" code here
except StopIteration, e : # make the process on the last element here
    print item

Outputs :

1
2
3
3

But really, in your case I feel like it's overkill.

In any case, you will probably have better luck with slicing:

for item in iterable[:-1] :
    print item
print "last :", iterable[-1]

#outputs
1
2
last : 3

or just :

for item in iterable :
    print item
print iterable[-1]

#outputs
1
2
3
3

Finally, a KISS way to do your stuff, and one that works with any iterable, including the ones without __len__:

item = ''
for item in iterable :
    print item
print item

Outputs:

1
2
3
3

I feel like I would do it that way; it seems simple to me.

e-satis
But note that iterable[-1] will not work for *all* iterables (such as generators, which do not have __len__)
e.tadeu
Right, I edited it to add the hack I would use myself. Not the brightest, but definitely the easiest.
e-satis
If all you want is to access the last item after the loop, simply use `item` instead of re-calculating it using `list[-1]`. But nevertheless: I don't think this is what the OP was asking for, was it?
Ferdinand Beyer
Compensated the -1, it was not said in the problem definition that len was to be avoided (even used in the example).
RedGlyph
Maybe I didn't get the question then.
e-satis
@e-satis: the comment wasn't meant for you ;-)
RedGlyph
Haven't slept enough. Should never get on SO before 6 good hours of sleep.
e-satis
Re: `iterable.__iter__() ` - please don't call `__` functions directly. Should be `iter(iterable)`.
Paul McGuire
+1  A: 

You can use a sliding window over the input data to get a peek at the next value and use a sentinel to detect the last value. This works on any iterable, so you don't need to know the length beforehand. The pairwise implementation is from itertools recipes.

from itertools import tee, izip, chain

def pairwise(seq):
    a,b = tee(seq)
    next(b, None)
    return izip(a,b)

def annotated_last(seq):
    """Returns an iterable of pairs of input item and a boolean that shows if
    the current item is the last item in the sequence."""
    MISSING = object()
    for current_item, next_item in pairwise(chain(seq, [MISSING])):
        yield current_item, next_item is MISSING

for item, is_last_item in annotated_last(data_list):
    if is_last_item:
        # current item is the last item
        pass
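For readers on Python 3, where izip no longer exists, the same sentinel idea might look like the sketch below. It uses the built-in zip instead, and inlines the pairwise step; the sample input "abc" is only an illustration.

```python
from itertools import chain, tee

def annotated_last(seq):
    """Yield (item, is_last) pairs for any iterable (Python 3 sketch)."""
    missing = object()  # sentinel that cannot collide with real data
    current, following = tee(chain(seq, [missing]))
    next(following, None)  # shift the second iterator one step ahead
    for item, next_item in zip(current, following):
        yield item, next_item is missing

# Works on plain iterators too, since no len() is needed.
result = list(annotated_last(iter("abc")))
assert result == [("a", False), ("b", False), ("c", True)]
```

An empty input yields nothing, because the only pair that could be formed would start at the sentinel itself.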
Ants Aasma
+7  A: 

The 'code between' is an example of the Head-Tail pattern.

You have an item, which is followed by a sequence of ( between, item ) pairs. You can also view this as a sequence of (item, between) pairs followed by an item. It's generally simpler to take the first element as special and all the others as the "standard" case.

Further, to avoid repeating code, you have to provide a function or other object to contain the code you don't want to repeat. Embedding an if statement in a loop which is always false except one time is kind of silly.

def item_processing( item ):
    pass  # *the common processing*

head_tail_iter = iter( someSequence )
head = head_tail_iter.next()
item_processing( head )
for item in head_tail_iter:
    # *the between processing*
    item_processing( item )

This is more reliable because it's slightly easier to prove. It doesn't create an extra data structure (i.e., a copy of a list) and doesn't require a lot of wasted execution of an if condition which is always false except once.
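As a rough Python 3 sketch of the same head-tail shape: the wrapper name `process_with_between` and its two callback parameters are hypothetical, introduced only to make the example self-contained. It also guards against an empty sequence, which the snippet above does not.

```python
def process_with_between(sequence, item_processing, between_processing):
    """Call item_processing on each item and between_processing between items."""
    it = iter(sequence)
    sentinel = object()
    head = next(it, sentinel)  # the default avoids StopIteration on empty input
    if head is sentinel:
        return
    item_processing(head)
    for item in it:
        between_processing()
        item_processing(item)

# Usage: record the order of calls in a list.
log = []
process_with_between([1, 2, 3], log.append, lambda: log.append("--"))
assert log == [1, "--", 2, "--", 3]
```

There is no per-iteration if here; the only check happens once, before the loop.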

S.Lott
Function calls are way slower than `if` statements so the “wasted execution” argument does not hold.
Ferdinand Beyer
I'm not sure what the speed difference between function call and if-statement has to do with anything. The point is that this formulation has no if-statement that's always false (except once.)
S.Lott
I interpreted your statement “…and doesn't require a lot of wasted execution of an if condition which is always false except once” as “…and is faster since it saves a couple of `if`s”. Obviously you are just referring to “code cleanliness”?
Ferdinand Beyer
A: 

Assuming input as an iterator, here's a way using tee and izip from itertools:

from itertools import tee, izip
items, between = tee(input_iterator, 2)  # Input must be an iterator.
first = items.next()
do_to_every_item(first)  # All "do to every" operations done to first item go here.
for i, b in izip(items, between):
    do_between_items(b)  # All "between" operations go here.
    do_to_every_item(i)  # All "do to every" operations go here.

Demo:

>>> def do_every(x): print "E", x
...
>>> def do_between(x): print "B", x
...
>>> test_input = iter(range(5))
>>>
>>> from itertools import tee, izip
>>>
>>> items, between = tee(test_input, 2)
>>> first = items.next()
>>> do_every(first)
E 0
>>> for i,b in izip(items, between):
...     do_between(b)
...     do_every(i)
...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4
>>>
Anon
+3  A: 

This is similar to Ants Aasma's approach but without using the itertools module. It's also a lagging iterator which looks ahead a single element in the iterator stream:

def last_iter(it):
    # Ensure it's an iterator and get the first field
    it = iter(it)
    prev = next(it)
    for item in it:
        # Lag by one item so I know I'm not at the end
        yield 0, prev
        prev = item
    # Last item
    yield 1, prev

def test(data):
    result = list(last_iter(data))
    if not result:
        return
    if len(result) > 1:
        assert set(x[0] for x in result[:-1]) == set([0]), result
    assert result[-1][0] == 1

test([])
test([1])
test([1, 2])
test(range(5))
test(xrange(4))

for is_last, item in last_iter("Hi!"):
    print is_last, item
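One caveat when porting this to Python 3: since Python 3.7 (PEP 479), a StopIteration that escapes from next(it) inside a generator is turned into a RuntimeError, so the empty-input case needs an explicit guard. A minimal sketch of that variant, using True/False instead of 1/0 for the flag:

```python
def last_iter(iterable):
    """Yield (is_last, item) pairs, lagging one item behind the input."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: yield nothing
    for item in it:
        # Lag by one item, so prev is known not to be the last one
        yield False, prev
        prev = item
    yield True, prev

assert list(last_iter("Hi!")) == [(False, "H"), (False, "i"), (True, "!")]
assert list(last_iter([])) == []
```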
Andrew Dalke
I liked this solution too!
e.tadeu